In this post we will look at a stacked (meshed) VDOM routing concept. This is a way to create VDOMs for departmental isolation, along with a shared VDOM that routes our traffic to the public WAN.
In case you haven’t noticed, there have been a lot of companies buying each other. There are a lot of reasons for this: low interest rates, SOX, exchange rules, etc. Another reason for the high level of consolidation is IT systems. Combining the resources of two companies may make sense for the bottom line, but I doubt this could be done at the scale we see today without IT systems to help facilitate it.
(If you want to read more on corporate consolidation here is a good article)
IT & Corporate Scale
In nature, when an organism gets larger, it takes on more complexity. Some organisms fit small niches where they have advantages by being simple (single-cell organisms), while others benefit from very complex arrangements of interdependent groups of cells. Companies work on similar principles: some work best when small (restaurants), while others require a massive organization of people, capital and geography to be successful (automotive industry).
Much like animals, companies need a connective tissue to organize their various functions, people and locations. In nature it is the nervous system; in the corporate world it is IT systems.
Then & Now
Let’s take the example of calculating profit for an enterprise. Executives must collect information about expenses, sales, assets, taxation, etc. to come to a final number. This information may be spread across hundreds of employees, datasets and multiple locations. Without IT systems this might be done by creating paper-based bookkeeping records, invoices, etc., collecting them all and sending them to a head office to make sense of and produce one final report. Obviously this would require significant manpower and time. In this system, you can imagine that a company can only grow so much before the administrative burden diminishes any real growth in the company.
Now fast forward to the modern day. Computers can handle much of the administrative work that used to require large teams of people. Invoices are created and paid automatically by computers, balance sheets are generated in real time, forecasting is done in seconds, and more. If you want to grow your business you just need to modify your systems.
Buying a business just means integrating their systems with yours. Personnel changes are inevitable, but taking two payroll departments and combining their efforts is now more of a systems challenge than a people issue. Due diligence is a lot simpler as more data is available, and in the future it will likely be done mostly automatically. In the end, buying a company is a lot simpler and costs much less than in the past.
M&A In The Future
Companies generally buy other companies in order to grow, enter new markets or stifle competitors. What you end up with is people, processes, locations and IT systems. For Walmart to buy Target, that would mean a lot of people and places. What about when Facebook bought Instagram? They really bought an IT system for images and video. The developers were valuable, but it wasn’t like they got 1000 new employees. This is really a sneak peek at most future M&A deals: it will be about systems!
We are really seeing a big change in the corporate world. The world may very well come to look like a dystopian sci-fi novel, where only a few massive corporations exist. Computers may be a means of disruption, but they are also a means of consolidation.
I am keeping this list to track articles with useful info about networking in Azure instances.
I am keeping this list to track articles with useful info about networking in AWS VPC instances.
Consider the case where a workstation in Building A is sending data to another network device in Building B. The ground potential of each building will be a function of the impedance of its ground system and the current flowing through the ground. The data line, in addition to carrying data, also connects the ground systems of the two buildings together. If the ground potentials of Building A and Building B are different, that difference is known as a ground potential difference, and it drives a ground current through the data line. The voltage level of the data signals is increased or decreased by the ground potential difference, causing data transmission errors.
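As a rough illustration of the paragraph above, the sketch below applies Ohm's law to each building's ground system and shows how the difference shifts the signal level seen at the far end. The function names and all numeric values are hypothetical and chosen only for the arithmetic; a real installation is more complicated than this simplified DC model.

```python
def ground_potential(ground_current_a: float, ground_impedance_ohm: float) -> float:
    """Ohm's law: potential (volts) of a ground system carrying a given current."""
    return ground_current_a * ground_impedance_ohm


def level_seen_at_b(signal_v: float, ground_a_v: float, ground_b_v: float) -> float:
    """Signal level measured against Building B's ground.

    The signal is generated relative to Building A's ground, so the
    receiver sees it shifted by the ground potential difference.
    """
    ground_potential_difference = ground_a_v - ground_b_v
    return signal_v + ground_potential_difference


# Hypothetical values, for illustration only.
ground_a = ground_potential(ground_current_a=2.0, ground_impedance_ohm=0.5)
ground_b = ground_potential(ground_current_a=0.5, ground_impedance_ohm=5.0)

print("Ground potential difference (V):", ground_a - ground_b)
print("5.0 V signal as seen in Building B (V):", level_seen_at_b(5.0, ground_a, ground_b))
```

If the shifted level crosses the receiver's decision threshold, bits are misread, which is the transmission error described above.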
I have felt for a long time that public network gear testing should be done in the networking community. This is a fairly common practice for things like video cards, SSDs/HDDs, CPUs, etc., yet it is very rare in the networking area. In the past this lack of testing could be attributed to the difficulty of getting equipment and of setting up and running the correct tests. RFC 2544 gives us a simple set of test parameters to accurately gauge performance, but results are usually published by vendors or in individual accounts, neither of which gives a simple baseline of who performs best. These factors, and more, have meant that the IT professional has a hard time comparing various platforms.
In recent years NFV has created an opportunity to change the testing paradigm and has added new questions about performance expectations. It is now easier to spin up routers and test them, since all one needs is an x86 CPU and VMware. On the flip side, it is now harder than ever to understand the performance differences between these various platforms. If a vendor says it can route one million packets per second on a software platform, does that mean it can achieve such results on all CPUs? Since the customer is expected to provide the CPU to run the router, new questions emerge about performance.
Running a router on a generic x86 CPU will have a large impact on how network professionals evaluate platforms. Currently, x86 can only scale a single thread so far before a limit is reached; you can only buy so much performance. Due to this limitation, IT professionals may be required to add more CPU cores to improve performance, but this will certainly provide diminishing returns. Many features may not be programmed for multithreading, or may not lend themselves to it, BGP being a common example. The community needs a resource to broadly compare CPU scaling among platforms to inform decision making and allow for better discussions.
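One way to picture the diminishing returns from adding cores is Amdahl's law, which the post does not name but which fits the argument. The sketch below assumes a hypothetical fraction of the router's work that can run in parallel; the function name and the example fractions are illustrative, not measured values for any platform.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup over one core when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical parallel fractions; real platforms have to be measured.
for fraction in (0.5, 0.9):
    for cores in (1, 2, 4, 8):
        speedup = amdahl_speedup(fraction, cores)
        print(f"parallel={fraction:.0%} cores={cores} speedup={speedup:.2f}x")
```

A feature that is effectively single threaded corresponds to a parallel fraction near zero, where extra cores add almost nothing.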
I plan to create a single testbed to fairly compare as many x86 routing and firewall platforms as I can. I’d like to produce results that clearly show how effective CPU scaling across multiple cores is, using a consistent testbed. In the end, we should be able to look at the results and understand how NAT, routing, BGP, etc. perform on x86. I’d like to solicit as much feedback from the community as I can so as to create a valuable, valid and fair comparison among various routers and firewalls.
- Understand how various routing platforms scale across multiple CPUs.
- x86 CPUs have limits as to how far they can scale up (single threaded performance). Adding more cores is often the only practical way to improve performance.
- Most IaaS clouds don’t allow you to pick CPU models, so increasing single threaded performance may not be an option.
- Having multiple cores for NFV hosts may be seen as a way to improve performance down the line. This might be wishful thinking.
- Create a methodology for testing various software routing platforms that would be applicable to real world use (not 100% real world).
- Results should give an idea of router performance and feature performance comparative to other vendors.
- Allow for other IT professionals to create and publish valid performance benchmarks.
- Create valuable performance data for the networking community.
- Vendor performance metrics are considered suspect in the community. NFV makes anticipating performance very difficult due to an increase in variables.
- Results are found through customer demands, expensive PoCs, or word of mouth. This sucks.
- Build a lab that can produce reliable, sensible results
- Need results to be valid, can’t have interference from external factors (NIC issues, CPU contention, etc)
- Should follow how vendors expect it to be configured in the field. Some recommend special settings in VMware.
Things to Test (start simple)
- PPS at frame sizes IMIX, 64, 1024 and 1518 bytes. Test latency during this time. Max PPS is the rate reached when packets start to drop.
- See how performance will scale between 1-8 cores
- Some frame sizes might be hard to test due to interface rate limits. Using a slower CPU might help.
- Test using IPv4 and IPv6.
- NAT performance. Test latency during this time. Max PPS is the rate reached when packets start to drop.
- See how performance will scale between 1-8 cores
- Test with just 64 byte packets?
- BGP Performance
- BGP is often single threaded, which is an issue for many users
- Don’t know how to test BGP performance 🙂
- What else should I test? It should be fairly simple to test and fair to all platforms. Not sure about testing VPN performance, as I might hit the traffic generator’s (TGen) limits before the router’s.
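The "max PPS when packets are dropped" measurement in the list above is close to the RFC 2544 throughput idea: the highest offered rate at which no frames are lost. A binary search is one common way to find it. The sketch below is only an outline under assumptions: `find_zero_loss_rate`, `run_trial` and `make_simulated_router` are hypothetical names, and the simulated router stands in for a real traffic generator trial.

```python
from typing import Callable


def find_zero_loss_rate(run_trial: Callable[[int], int],
                        max_rate_pps: int,
                        resolution_pps: int = 1000) -> int:
    """Binary search for the highest offered rate with zero dropped packets.

    run_trial(rate_pps) must return the number of packets dropped at that rate.
    The result is within resolution_pps of the true zero-loss rate.
    """
    if run_trial(max_rate_pps) == 0:
        return max_rate_pps  # the link, not the router, is the limit
    low, high = 0, max_rate_pps  # low always passes, high always fails
    while high - low > resolution_pps:
        mid = (low + high) // 2
        if run_trial(mid) == 0:
            low = mid
        else:
            high = mid
    return low


def make_simulated_router(capacity_pps: int) -> Callable[[int], int]:
    """Stand-in for a real trial: drops whatever exceeds a fixed capacity."""
    def run_trial(rate_pps: int) -> int:
        return max(0, rate_pps - capacity_pps)
    return run_trial


# Hypothetical capacity, used only to exercise the search.
trial = make_simulated_router(capacity_pps=1_234_567)
print("Estimated zero-loss rate (pps):", find_zero_loss_rate(trial, 14_880_952))
```

On a real rig each trial would run for a fixed duration per frame size and core count, with latency recorded alongside the result.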
Test Platform Options
I have built out two possible test rig designs. I am really not sure which one will produce the most valid results. It may very well be that they both will produce equally valid results, but I am hoping I can get some community feedback that will confirm this.
As I go through this, I believe that using 10Gb NICs may not offer enough interface speed to find the CPU limits on many platforms. Using a slow or underclocked CPU may help with this. I don’t believe that this would make the result unfair since all vendors would be subject to the same CPU. The goal is to understand scaling, not maximum possible performance.
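To see where the interface becomes the limit, it helps to work out the theoretical packet rate of a link at each frame size. The sketch below uses standard Ethernet framing, where every frame carries 20 extra bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter and 12-byte inter-frame gap). The function name is my own, and frame sizes are assumed to include the 4-byte FCS.

```python
ETHERNET_OVERHEAD_BYTES = 20  # 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap


def line_rate_pps(link_bps: int, frame_bytes: int) -> int:
    """Theoretical maximum whole frames per second on an Ethernet link."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps // bits_on_wire


TEN_GIGABIT = 10_000_000_000

for frame_bytes in (64, 1024, 1518):
    pps = line_rate_pps(TEN_GIGABIT, frame_bytes)
    print(f"{frame_bytes:>5} byte frames: {pps:,} pps")
```

If a router can forward more than these rates at a given frame size, a 10Gb interface will hide its real CPU limit, which is where a slower CPU would help.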
I have most of the hardware for this right now, although I need to do further research to ensure the CPU I select for the router will produce valid results. A major concern I have is that performance may be very different thanks to CPU features introduced in later generations. I know this would matter for AES performance, but I am not sure if it matters for what I want to test. I am looking for feedback on this point.
I plan to test all routers on VMware vSphere. I believe that most people will be running software routers as VMs on this platform, so the results should be acceptable. vSphere is also free, so that makes my life a bit easier.
The NIC on the host will be an Intel X520. This is a very common NIC from a reputable vendor. I don’t think this should skew results.
- I’d like results to be similar to how video cards are reviewed. Lots of graphs comparing different platforms with each other. Readers should get a decent impression by just looking at the graphs.
- Present any interesting details or platform quirks.
- Must be fair to all vendors. This means that the specific CPU I test one router on must be used to test all comparison routers. Can’t introduce something that skews results.
Questions For The Community
I am currently undecided on the how part. I’d like to use software that is simple to operate and can report critical results (PPS, packet loss, Mbps, etc.). I also don’t know how to test BGP performance, so I need some help with that.
- For PPS testing there are a few options
- BGP Testing
- Any tools for testing BGP convergence?
- Any do’s and don’ts for this?
- When testing NAT scaling, should I use anything above 64-byte packets? I don’t think testing above this would matter for the results I wish to produce.
- Any advice on the testing rig?
- All thoughts are very appreciated
You can skip this section if you like, but it should help readers who are wondering why I would do this at all.
- Learn more about more platforms in a directed manner. This should give me a purpose for working with various platforms in new ways. I am not the type of person who can spin up a lab and work all day understanding a feature just for kicks.
- Learn more about technologies/vendors I don’t have a professional reason to practice due to current employment.
- Give back to the community. I get a lot of value from various networking groups, and I feel like I should give something back.
- Learn more about testing. Lots of networking stuff is only really tested in production.
- I grew up watching Home Improvement so I like more power!