Author: Marco González
Editor: Randall Roland
Node operators, Antelope core developers, and community members get together each week to discuss the captivating questions of the day. The primary objective of each Node Operator Roundtable is:
“...to improve the Antelope protocol (specifically) for node operators”.
Meetings occur every Wednesday from 14 UTC to 15 UTC (13 UTC to 14 UTC during daylight savings). The EOS Network Foundation provides tutorials and documentation for those wishing to learn the basics of operating an EOS node (and more).
Below is a list of the two roundtables contained within this bi-monthly summary:
June 21: P2P Improvements Document Listing the Problem, Solutions
June 28: Block Trimming, Planning for Leap 5.0, IF, OC
June 21: P2P Improvements Document Listing the Problem, Solutions
The meeting on June 21 continued the P2P discussion, guided by a GitHub document that addresses some of the feedback from recent weeks.
Overview
The guiding document clearly defines the problem, and the acceptable solution it outlines aligns closely with feedback from previous meetings.
Updates
the Leap 4.0.3 patch was released during the meeting
Breaking Down the Guiding Document on P2P Improvements
The guiding document page breaks down into several sections: the Problem, Solution, Issues, Resources, and Comments.
Problem: Opportunities, Audience, and Strategy
Targeted user groups have different needs, and the priorities among them are now better understood.
The identified audience is Antelope-based and:
“technically proficient individuals” (node operators) concerned with “reliable, cost-effective, and simplified solutions”
To align the problem with core development, there needs to be a focus on “improving the user experience and infrastructure efficiency”. Successful implementation is expected to yield “wider adoption and success of the Antelope protocol.”
Returning to the opportunities for target users, the top priorities are as follows:
The highest on the list is to “improve catch up mode peer selection for available blocks”.
Bandwidth has been a common topic of discussion in recent weeks. Improving control over “peer bandwidth consumption” is the next priority. Avoiding abuses and automatically rotating peers are focal points.
The third major priority is the “ability to label peer connections as internal or external”. Success here is expected to address “trust (levels) within internal infrastructure”.
See the document and/or video for descriptions of other issues including a terminal-based UI, peer blacklists, and syncing.
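To make the top two priorities more concrete, below is a minimal sketch (in Python) of how a syncing node might pick and rotate catch-up peers based on which peers actually have the needed blocks and how much bandwidth each has already consumed. The names, thresholds, and logic are illustrative assumptions made for this summary; they do not describe nodeos internals.

    from dataclasses import dataclass

    @dataclass
    class Peer:
        address: str              # e.g. "peer.example.com:9876" (hypothetical)
        highest_block: int = 0    # last block this peer has advertised
        bytes_received: int = 0   # bandwidth already consumed from this peer

    BANDWIDTH_BUDGET = 50 * 1024 * 1024   # per-peer budget before rotating (assumed value)

    def pick_catchup_peer(peers: list[Peer], needed_block: int) -> Peer | None:
        """Prefer peers that actually have the blocks we need (priority 1) and skip
        peers that have exceeded their bandwidth budget (priority 2)."""
        candidates = [p for p in peers
                      if p.highest_block >= needed_block
                      and p.bytes_received < BANDWIDTH_BUDGET]
        if not candidates:
            return None   # nothing suitable: wait, or reset budgets and rotate
        # Rotate toward the least-used candidate instead of hammering one peer.
        return min(candidates, key=lambda p: p.bytes_received)

    # Example: only peers that hold block 950,000 and are under budget qualify.
    peers = [Peer("a:9876", highest_block=1_000_000),
             Peer("b:9876", highest_block=900_000),
             Peer("c:9876", highest_block=1_000_000, bytes_received=60 * 1024 * 1024)]
    print(pick_catchup_peer(peers, needed_block=950_000).address)   # -> "a:9876"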
Defining a Solution and Identifying Risks
The name given to the product opportunity described above is “P2P Improvements”. The four primary metrics for success, as listed in the GitHub document, are:
Improve sync time for a new or restarted node
Reduce the number of steps to get a new node configured
Improve the reliability of transactions that transit the P2P network
Improve the speed of transactions that transit the P2P network
The process of identifying risks is guided by an article (The Four Big Risks) written by the Silicon Valley Product Group. The question asked centered on the P2P Peer Discovery RFP, which runs on a parallel timeline:
“Is there any risk of conflict or overlap?”
See the documentation page for more resources and issues.
Meeting Comments and Feedback
Participant responses include:
internal nodes and bottlenecks
storage speed is critical
blocks.log mentioned as a concern regarding reading and local peers (e.g. number of operations and syncing)
limited syncing options
bandwidth vs. range of state (may not be an issue if sufficient bandwidth and rotating peers)
Concluding Dialog
The meeting concluded with clarifying questions. An inquiry was made about throttling (slowing peers down) as a bandwidth management strategy. Based on the feedback, throttling is considered an option preferable to disconnecting peers.
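As a rough illustration of that idea, here is a minimal Python sketch of throttling a peer that exceeds a per-second bandwidth budget by pausing reads rather than dropping the connection. The rate, window, and class names are assumptions made for this example and do not reflect how nodeos implements bandwidth management.

    import time

    RATE_LIMIT_BYTES_PER_SEC = 1_000_000   # assumed per-peer budget

    class PeerThrottle:
        """Slow down a peer that exceeds its bandwidth budget instead of
        disconnecting it outright."""
        def __init__(self, rate: int = RATE_LIMIT_BYTES_PER_SEC):
            self.rate = rate
            self.window_start = time.monotonic()
            self.bytes_in_window = 0

        def on_bytes(self, n: int) -> float:
            """Record n received bytes and return how long to pause before
            reading from this peer again (0.0 means no throttling needed)."""
            now = time.monotonic()
            if now - self.window_start >= 1.0:   # start a new one-second window
                self.window_start, self.bytes_in_window = now, 0
            self.bytes_in_window += n
            excess = self.bytes_in_window - self.rate
            return max(0.0, excess / self.rate)   # pause proportional to the overshoot

    # Example: a peer delivers 1.5 MB within one second -> pause ~0.5 s, keep the peer.
    throttle = PeerThrottle()
    print(throttle.on_bytes(1_500_000))   # -> 0.5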
The development team invites node operators to help steer the course of the next steps. Attention was called to the task breakdown (“Features/Epics”) section of the document page. Bandwidth and control mechanisms were mentioned here.
June 28: Block Trimming, Planning for Leap 5.0, IF, OC
The June 28 meeting identified block trimming as an area to improve. Planning for Leap 5.0 includes preparation time, new standards, IF, OC, and more.
Overview
Node operators shared their experience with the current block-trimming process. Diagnostics, automation, and more solution types were discussed. The last third of the meeting discussed planning and preparations for Leap 5.0.
Updates
there’s no new update on Leap 4.0
the development team continues to rework a tentative schedule for a 5.0 release candidate
More about Leap 5.0 in a later section.
Community Concerns
The floor was opened to hear the concerns lingering on the community’s minds. The discussion began around trimming blocks and the differing issues for the current 4.0.3 versus what is expected in 5.0. Block trimming garnered immediate interest. Using leap-util trimming as a solution for a blocks.log exception drew several comments. The current need to extend trimming to several hundred blocks was seen as something that could be improved; smaller trims that can identify the errors (exceptions) would be an attractive feature to add.
Topics Discussed in More Depth
Two topics seemed to resonate out of the opening discourse:
formatting and contents of leap-util (e.g. smoke test)
need to roll back the blocks.log
Below is a list of specifics mentioned. List items follow a general chronological and logical order as recorded in the meeting:
display vs. functionality
potential improvements for minimum trimming
ensuring a secure solution
suggested adding successive (one-by-one) trimming
a trimming-diagnostic tool could determine corrupt blocks
auto repairing solution that’s careful not to auto-remove blocks
public vs. private node solutions
complete fix via snapshot refers back to cautions concerning public vs. private nodes
The community seems to want easier diagnostic and recovery tooling. For more details, see GitHub issue #1348, “leap-util is unable to successfully run a detailed smoke test and trim the blocklog.”
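To illustrate the “successive (one-by-one) trimming” suggestion, the Python sketch below trims the block log back in small steps and re-runs the smoke test after each step, instead of cutting several hundred blocks at once. The leap-util subcommands and flags shown (block-log smoke-test, trim-blocklog, --blocks-dir, --last) are quoted from memory and may differ by version, so check leap-util --help before relying on them; the paths and step size are placeholder assumptions.

    import subprocess

    BLOCKS_DIR = "/var/lib/nodeos/data/blocks"   # hypothetical data directory
    STEP = 10                                    # trim a handful of blocks at a time

    def smoke_test_ok(blocks_dir: str) -> bool:
        # Assumed subcommand; verify against your leap-util version.
        result = subprocess.run(
            ["leap-util", "block-log", "smoke-test", "--blocks-dir", blocks_dir])
        return result.returncode == 0

    def trim_tail(blocks_dir: str, last_block: int) -> None:
        # Assumed flags: keep blocks up to last_block and trim the rest.
        subprocess.run(
            ["leap-util", "block-log", "trim-blocklog",
             "--blocks-dir", blocks_dir, "--last", str(last_block)],
            check=True)

    def trim_until_clean(head_block: int) -> int:
        """Trim in small increments until the smoke test passes, so the
        operator keeps as much of the block log as possible."""
        last = head_block
        while not smoke_test_ok(BLOCKS_DIR):
            last -= STEP
            trim_tail(BLOCKS_DIR, last)
        return last   # last known-good block retained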
Planning for Leap 5.0
As previously mentioned, planning for a Leap 5.0 release candidate has begun. The development team is working to mitigate the long, complicated process involved in a consensus upgrade, and Leap 5.0 will establish a standard procedure for it. A couple of months are needed to prepare the community, so plans focus on a pre-holiday release, with the rollout beginning in September.
Expectations are that Leap will undergo two major releases per year, one of which will likely require a consensus upgrade. Reducing node operating costs is among the chief concerns; this includes cost reductions for node operators when adding new chains. Comments were made about the benefits of experience and optimization (RAM, CPU, and disk space). The importance of more operators upgrading to 4.0 was stressed.
Two primary protocol upgrade features for 5.0 are:
Instant Finality (IF)
Optimizing Compiler (OC) features
IF seems to be the make-or-break point of the two; it remains a key initiative of the ENF’s initial plan for the New EOS. OC features deliver performance benefits primarily aimed at improving the EOS EVM; however, there are broader OC benefits that extend to block producers as well. See GitHub issue #1322, “Add eos-vm-oc-enable auto mode”, for more information.
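As a conceptual sketch of what an “auto” OC mode could look like (an assumption for illustration only, not the behavior implemented in issue #1322), the Python snippet below enables the optimizing compiler just for a small set of system and EVM contract accounts and falls back to the baseline VM for everything else.

    # Illustrative policy only; the real nodeos option involved is eos-vm-oc-enable.
    SYSTEM_ACCOUNTS = {"eosio", "eosio.token", "eosio.evm"}   # assumed example set

    def use_oc(account: str, mode: str) -> bool:
        if mode == "all":
            return True                         # compile every contract with OC
        if mode == "auto":
            return account in SYSTEM_ACCOUNTS   # OC only where it pays off most
        return False                            # OC disabled

    print(use_oc("eosio.evm", "auto"))    # -> True
    print(use_oc("somegame111", "auto"))  # -> False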
Other items are being developed in parallel and are expected in future releases so as not to delay 5.0.
Concluding Dialog
The meeting concluded with an examination of OC features for 5.0. Current development is concerned with fundamental improvements: OC should first run on eosio system contracts (e.g. for tokens and EOS EVM) before the other features mentioned. New features and preparing for Leap 5.0 will likely remain the focus of future meetings.
Sources & References