Author: Markus Hinrichs
Editor: Randall Roland
Node operators, Antelope core developers, and community members meet each week to talk about the network and its development. The primary objective of each Node Operator Roundtable is:
“…to improve the Antelope protocol (specifically) for node operators.”
Roundtables occur every Wednesday. Visit the Telegram channel for information about joining. The EOS Network Foundation provides tutorials and documentation for those who want to learn the basics of operating an EOS node.
Below are the roundtables covered in this bi-monthly summary:
November 29: Leap 5.0 Update Schedule, Prometheus Metrics & Dashboard, and more
November 22: BLS, RC3 Status, Instant Finality Progress for Leap 6, Block Log Stride Setting and more
Please be sure to look for additional meeting notes and comments on GitHub. Videos are available on the ENF’s YouTube channel.
November 29: Leap 5.0 Update Schedule, Prometheus Metrics & Dashboard, and more
Leap 5.0 Update Schedule:
Anticipation of several imminent releases, particularly Leap 5.0 RC3, scheduled for December 6th.
Mention of patch releases (3.2.5 and 4.0.5) addressing compatibility issues discovered during development of 5.0.
Stabilizing 5.0 After RC3:
The aim is to move to stable quickly after RC3, foreseeing no significant changes between RC3 and the stable release.
The plan is to avoid releasing just before the American holidays and to aim for a stable release by January 3rd, provided no issues surface with RC3.
Call for Testing RC3:
Request for increased testing of RC3 once available, specifically a call to action to install it on API nodes and report any issues to the ENF.
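For operators who take up that call, a quick way to confirm an API node is actually running the release under test is to query the standard `/v1/chain/get_info` endpoint and read back the reported server version. The sketch below assumes a placeholder node URL; substitute the node you are upgrading.

```python
import requests

# Hypothetical API node URL; replace with the node being upgraded to RC3.
NODE_URL = "https://api.example-eos-node.io"

def report_node_version(node_url: str) -> None:
    """Print the server version and head block reported by a nodeos API node."""
    info = requests.get(f"{node_url}/v1/chain/get_info", timeout=10).json()
    print("server version:", info.get("server_version_string"))
    print("head block    :", info.get("head_block_num"))

if __name__ == "__main__":
    report_node_version(NODE_URL)
```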
BLS Activation Readiness for RC3:
Highlighted the necessity for BLS activation readiness before the RC3 release.
Prometheus Metrics and Dashboard:
Discussion about creating a dashboard from the metrics, with a list shared with the DEV team for feedback.
Emphasis on the need for more explanatory guidance aimed at external users unfamiliar with the codebase, including an explanation of each metric's units, similar to nodeos, for clarity.
The discussion primarily revolved around imminent release schedules, stability plans post-RC3, testing calls, BLS activation readiness, and considerations for improving explanatory guidance in Prometheus metrics for better user understanding.
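For operators who want to see what is exposed before building a dashboard, a minimal sketch like the one below can list the available metric names. It assumes nodeos is running with the Prometheus plugin enabled and that the metrics are served at the path shown on the local HTTP port; the exact endpoint and port depend on your Leap version and configuration.

```python
import requests

# Hypothetical metrics URL; the path and port depend on how the
# Prometheus plugin is configured in your Leap version.
METRICS_URL = "http://127.0.0.1:8888/v1/prometheus/metrics"

def list_metric_names(url: str) -> list[str]:
    """Fetch the Prometheus text exposition and return the unique metric names."""
    text = requests.get(url, timeout=10).text
    names = set()
    for line in text.splitlines():
        if line and not line.startswith("#"):
            # A sample line looks like: metric_name{labels} value
            names.add(line.split("{")[0].split(" ")[0])
    return sorted(names)

if __name__ == "__main__":
    for name in list_metric_names(METRICS_URL):
        print(name)
```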
Duplicate Agent Names Issue: The challenge with duplicate agent names in config.ini causing transmission issues, often due to unnecessary quotes in the agent name field.
Agent Name Generation: Considering enforcing uniqueness in agent names or generating random names to resolve transmission issues. The possibility of automatic updates based on enabled features.
Proposed Solution — Connection ID: Introducing a new config.ini option called “connection ID” to resolve duplicate agent name problems. Discussion on generation during startup and potential challenges with file writability.
Config.ini Management: Long-standing feature request for splitting config.ini across multiple files, akin to Debian-style .d drop-in directories, to manage varied node configurations more effectively (see the sketch after this list).
Usage Preference — Config.ini vs. Command Line Arguments: An observation that some operators prefer command line arguments over config.ini for ease of visibility and for running integration tests.
Visibility of Configuration Options: Discussion on the benefits of seeing configuration options on the command line versus config.ini and the potential need for an endpoint to display non-default options.
Port Separation and Security: Benefits of separating ports for security and easier management of public-facing nodes. Challenges and adjustments encountered with upgraded nodes.
API Endpoint Issues: Addressing issues like the visibility of certain protocol features in API nodes and proposing solutions within different node groups for better access and control.
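Since nodeos reads a single config.ini today, operators who want the drop-in style split described above would need to assemble the file themselves. The following is a minimal sketch, not a nodeos feature: it concatenates hypothetical *.ini fragments from a config.d directory into one config.ini and warns if the agent-name value contains quotes, one cause of the agent name problems mentioned earlier.

```python
from pathlib import Path

# Hypothetical layout: fragments live in config.d/ and are merged into config.ini.
FRAGMENT_DIR = Path("/etc/nodeos/config.d")
OUTPUT_FILE = Path("/etc/nodeos/config.ini")

def merge_fragments(fragment_dir: Path, output_file: Path) -> None:
    """Concatenate *.ini fragments in lexical order into a single config.ini."""
    parts = []
    for fragment in sorted(fragment_dir.glob("*.ini")):
        text = fragment.read_text()
        parts.append(f"# --- from {fragment.name} ---\n{text.rstrip()}\n")
        for line in text.splitlines():
            key, _, value = line.partition("=")
            # Quotes in agent-name are passed through literally and have
            # been a source of problematic agent names on the p2p network.
            if key.strip() == "agent-name" and '"' in value:
                print(f"warning: quoted agent-name in {fragment.name}: {value.strip()}")
    output_file.write_text("\n".join(parts))

if __name__ == "__main__":
    merge_fragments(FRAGMENT_DIR, OUTPUT_FILE)
```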
November 22: BLS, RC3 Status, Instant Finality Progress for Leap 6, Block Log Stride Setting, and more
Technical Discussions and Updates:
Discussion on BLS intrinsics patching and its impact on node functionality.
Mention of a potential bug during updates and testing for potential fixes.
Anticipation of changes in RC3 and progress on testing strides for block logs.
Weekly Node Operators Roundtable:
Overview of updates on the 22nd of November 2023, including BLS changes and RC3 status.
Discussion on instant finality progress for the 6.0 release and challenges faced during redesign phases.
Block Log Stride Setting Discussion:
Exploration of the block log stride setting's backup and data recovery benefits.
Mention of potential bugs in the state history stride setting and plans for fixes.
Insight into the benefits of striding block logs for storage optimization and archiving.
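As a rough illustration of how striding splits the block log, the sketch below computes which segment a given block number would fall into for a chosen stride. The `blocks-<start>-<end>.log` naming used in the printout is an assumption for illustration only; consult the Leap documentation for the exact file layout your version produces.

```python
def stride_segment(block_num: int, stride: int) -> tuple[int, int]:
    """Return the (first, last) block numbers of the stride segment containing block_num."""
    index = (block_num - 1) // stride   # segments are `stride` blocks wide, starting at block 1
    first = index * stride + 1
    last = (index + 1) * stride
    return first, last

if __name__ == "__main__":
    stride = 1_000_000                  # e.g. the million-block stride discussed above
    for block in (1, 999_999, 1_000_000, 345_678_901):
        first, last = stride_segment(block, stride)
        # Assumed file name pattern, for illustration only.
        print(f"block {block:>11,} -> blocks-{first}-{last}.log")
```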
Verification and Debugging:
Consideration of checksums and file integrity in striding logs to prevent corruption.
Discussion on deterministic ship logs, enabling/disabling console logging, and its potential impact on performance.
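One simple way to apply the checksum idea to archived stride segments is to record a digest per file and verify it again after transfer or restore. The sketch below uses SHA-256 over a hypothetical retained-blocks directory; it is an external verification step, not the integrity mechanism built into nodeos itself.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file in streaming fashion."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    # Hypothetical directory holding retained block log segments.
    retained_dir = Path("/var/lib/nodeos/blocks/retained")
    for segment in sorted(retained_dir.glob("*.log")):
        print(f"{sha256_of(segment)}  {segment.name}")
```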
Console Logging Impact:
Insights into the evolution of console logging from debugging to a functional tool.
Caution about overusing console logging, its impact on layer 2 solutions, and previous instances of excessive logging causing issues.
This conversation touched on technical aspects, updates, node operator banter, and discussions about various settings and their impacts on node operations and performance.
Block and Ship Log Operations:
Discussion on splitting block and ship logs for efficient processing.
Issues with the block log becoming invalid after power loss or abrupt shutdowns.
Insights into potential reasons for block log corruption or trimming.
File Sharing Services and Log Benefits:
Consideration of using file sharing services for block and ship logs.
Benefits include easier full node setup, easier archiving, and cheaper storage solutions.
Potential for shared logs for multiple nodes, improving scalability.
Log Splitting and File System Limitations:
Cautionary notes about file system limitations with a large number of files in a directory.
No significant performance trade-offs observed in stride size variations (e.g., a million blocks vs. smaller strides).
Challenges with smaller stride sizes causing issues in file systems.
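As a rough illustration of the file-count concern: with EOS producing a block every half second, a chain several hundred million blocks deep split at a one-million-block stride yields only a few hundred segment files, whereas a 10,000-block stride over the same history would put tens of thousands of files in one directory, which is where some file systems begin to struggle.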
Log Write Operations and Recovery:
Understanding the log write operations and their association with reversible and irreversible blocks.
Queries about potential issues in log writes during abrupt shutdowns, leading to log file trimming.
OS Cache and Log Write Completeness:
Consideration of OS caching and potential delays in log write completeness.
Speculation about reasons behind log file trimming for recovery after power loss.
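The OS-cache point is the classic distinction between a write that has merely reached the page cache and one that has been forced to disk. The sketch below illustrates that distinction in general terms and is not how nodeos itself writes its block log: without the fsync step, data acknowledged by write() can still be lost on power failure, which is one plausible reason a log might need trimming back to the last consistent entry on recovery.

```python
import os

def append_durably(path: str, payload: bytes) -> None:
    """Append bytes and force them to stable storage before returning."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, payload)   # lands in the OS page cache first
        os.fsync(fd)            # blocks until the data reaches the device
    finally:
        os.close(fd)

if __name__ == "__main__":
    # Hypothetical log file used only to demonstrate the write/fsync pattern.
    append_durably("/tmp/example-block-log.bin", b"\x00" * 64)
```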
Future Improvements and Cautions:
Ideas for potential future improvements in handling log writes and recovery mechanisms.
Consideration of operator error and unplanned shutdowns affecting log integrity.
This conversation delved into various aspects of log operations, file system limitations, potential issues with log writes, and ideas for improvements in log handling and recovery mechanisms.
Sources & References