
dfleisch

Members
  • Content Count

    13
  • Joined

  • Last visited

  • Days Won

    1

dfleisch last won the day on November 14 2018

dfleisch had the most liked content!

Community Reputation

2 Neutral

My Information

  • Agent Count
    Less than 100


  1. dfleisch

    Running Automate in Azure

    The real-world performance difference between tasks on a SAN and on an internal SSD is _not_ likely to be easily quantifiable IF the SAN is optimized for the workload. We are talking about workloads that may or may not take advantage of certain aspects of storage, and the common workload here is MySQL. The difference should be smaller when you spend that much on a network-attached storage solution; it SHOULD mirror an SSD to a point.

    To be more scientific about it: by reviewing the impact on boot times or application ready-time in seconds, most people will say "OK, this will work" and go for it, but metrics dig much deeper. Subjective 'feel' rarely tells you how much better or worse something is; benchmark numbers do. Beyond that, Automate works better or worse with different hardware, like any software. A slower SAN may only add a second or two to OS boots and to less intensive, less latency-sensitive apps, so nothing changes in the user's opinion about speed and the SAN never gets flagged for replacement. Worse, it may have been purchased in the first place with the goal of replacing all local storage in the company, meaning the money may have been better spent elsewhere for certain applications.

    From my understanding, most IT companies evaluate storage against application load-time goals: does the solution meet productivity goals (speed), cost of ownership, and disaster recovery plans? Most check YES and purchase on that basis, but may not deep-dive into benchmarking every app and its speed difference.
    --This is a reactive situation: can we improve on x, y, and z without changing the real-world feel in general?
    --Since we spent $x,000 on this solution, why not move everything to it? They claim 100 users can use it without any impact to performance!

    With database-type workloads, SANs no longer work 'as well' per dollar, because they add latency and overhead: another layer on top of the on-disk storage controller, so between the client and the disk there is now a CPU, RAM, a SATA controller, plus software overhead.
    --This reply is NOT meant to educate you on a SAN's downfalls, nor to knock the SAN as a storage solution. It is simply here to explain what we see in practice, NOT to quantify how much of an impact 'no longer works as well' has versus another storage type.

    Would you purchase a SAN because it has enough space for the entire company, meets your backup and management/GUI requirements, saves you money, AND has enough IOPS for your productivity goal, and then just take the plunge? Most would, and I'm not saying anyone who does is not smart, but there's a little more to it! Any SAN / NAS / storage device has physical limits that come from its design, because moving parts are added to the path. Any way I slice it, MySQL works better on a drive that does fast random IO writes.
    --So ask yourself: has a test been run to show the $20,000 SAN is 'faster' or 'cheaper per GB' (with redundancy built into the equation) than another route, or was the choice made for convenience? For apps that need more performance (for whatever reason) there is sometimes a convenience tax, and that tax can be new hardware or a more expensive monthly investment. In terms of optimizations where 'MOST HARDWARE' will work, I think most hardware will work fine.
    Because these applications create thousands of calls per second when busy, performance is not free. It goes both ways: hardware and software need to work together, and performance is the result of that effort. Specific things like high latency just make it crawl, which is why I am using a SAN as an example of what may also expose weakness in applications that rely on MySQL. I don't think Azure is specifically showing high latency; it's an optimization choice on their end. That's what led to Diskspd being used: set a minimum spec, and if it's not met, try to meet it, with MB/s and latency together producing the IOPS figure. If a server is under-spec, it shows a low result in most cases. Most issues come from a hardware configuration choice: the SATA controller that sits between the OS and the physical disk, a SAN / NAS (shared storage) with latency spikes at random times (which are VERY hard to track), or something else, all of which create negative impact. Those who take the time to measure and estimate the numbers scientifically will appreciate it in more ways than I can express!
    --HOWEVER, exceeding the goal is more desirable for resolving a performance issue than being under spec, at spec, or near spec, IMO. Comparing results from servers without performance issues is mainly a way to establish a minimum spec and estimate the needed speed and type of disk. With a standardized benchmark, if a server scores 800 IOPS and after an upgrade scores 10,000 IOPS, we can say the transaction will complete QUICKER, but defining HOW MUCH quicker, or whether the problems will go away, is not as cut-and-dry.

    Here are some ideas I have: measure example workloads produced by huge (10k+ agents), large, medium, and small (<100) Automate servers by generating IO patterns that simulate a live box, then use some logic to suggest hardware and/or IOPS minimums based on those specs. This can help ensure query completion times are met and exceeded for 'minimum vs recommended' specs.

    As for MySQL RAM allocations: caching 90-100GB of an on-disk datadir in RAM when you have 128GB of RAM may be a reality for some, but for others it's less than ideal to have to drop $3k on RAM.
    --This goes along with 'what we typically see': NO, you don't need 128GB of RAM and a 100GB buffer if the DB is 100GB. Have I seen servers with that? Sure. Were they fast? Not always. Ideally, 64GB of RAM and a 50GB buffer on a 100GB on-disk DB is great to have. Is it always attainable, or needed? If not, why? These questions are often answered by money: spend more and get x% better performance, or spend more time adjusting and logging data and get a better result. Where does one cross the line and say $100 is not worth 1%, I'll stop there? With more advanced benchmark techniques, maybe we can define that better.

    I do have a MySQL-specific suggestion for innodb_buffer_pool_size, in the form of a query I've borrowed from Stack Exchange, which may help clarify:
    ---------------------------------------------------------------------------------------------------
    Recommended Buffer Pool Size --> https://dba.stackexchange.com/questions/27328/how-large-should-be-mysql-innodb-buffer-pool-size/27341#27341
    This will give you the RIBPS (Recommended InnoDB Buffer Pool Size) based on all InnoDB data and indexes, with an additional 60%.
    SELECT CEILING(Total_InnoDB_Bytes*1.6/POWER(1024,3)) RIBPS
    FROM (SELECT SUM(data_length+index_length) Total_InnoDB_Bytes
          FROM information_schema.tables
          WHERE engine='InnoDB') A;
    ----------- More concise formula: -----------
    SELECT CONCAT(CEILING(RIBPS/POWER(1024,pw)),SUBSTR(' KMGT',pw+1,1)) Recommended_InnoDB_Buffer_Pool_Size
    FROM (
        SELECT RIBPS, FLOOR(LOG(RIBPS)/LOG(1024)) pw
        FROM (
            SELECT SUM(data_length+index_length)*1.1*growth RIBPS
            FROM information_schema.tables AAA, (SELECT 1.25 growth) BBB
            WHERE ENGINE='InnoDB'
        ) AA
    ) A;
    ---------------------------------------------------------------------------------------------------
    I wasn't aware the test was forcing a sequential write with -w100 and that other variable; maybe you can make a suggestion otherwise. I haven't had a chance to really look into this yet. Are you saying the syntax does not reflect real MySQL performance differences by the numbers when run on multiple servers? From what I see, all of the slow servers we run the test on end up below the min spec of 3000 IOPS. A test more tailored to real workloads than a synthetic test would be nice!
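    If you want to act on that RIBPS number, here's a minimal sketch of my own (not part of the Stack Exchange answer). It assumes MySQL 5.7+, where innodb_buffer_pool_size can be resized online; on 5.6 you would edit my.ini and restart instead, and the 8GB value below is purely an example, not a recommendation:

    -- Compare the current buffer pool to the RIBPS suggestion above.
    SELECT ROUND(@@innodb_buffer_pool_size/POWER(1024,3), 1) AS Current_GB,
           CEILING(SUM(data_length+index_length)*1.6/POWER(1024,3)) AS RIBPS_GB
    FROM information_schema.tables
    WHERE engine='InnoDB';

    -- If you decide to raise it (example value only -- size it to YOUR RAM):
    SET GLOBAL innodb_buffer_pool_size = 8 * 1024 * 1024 * 1024;  -- 8GB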
  2. dfleisch

    Speed optimization of On-Prem Automate

    Yes, by design. If a deadlock happens, it waits on the write at QD=1 to complete and then releases, allowing the others to process. The test takes advantage of this fact and deliberately tests a worst-case scenario with 16k writes. We really should be testing 4k blocks instead: after some time spent graphing MySQL servers running our application, the 4k block is what's used, NOT 16k as previously thought.
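    If you want to sanity-check the block-size question on your own box, here's a rough sketch using stock InnoDB counters (my own generic MySQL queries, not an official Automate diagnostic). Data pages go out at innodb_page_size (16K by default), while redo-log writes are much smaller (512-byte blocks), which may be where a 4K pattern shows up at the OS level:

    -- What InnoDB is configured to write, and how much it has written.
    SHOW VARIABLES LIKE 'innodb_page_size';
    SHOW GLOBAL STATUS WHERE Variable_name IN
        ('Innodb_data_writes', 'Innodb_data_written',
         'Innodb_log_writes', 'Innodb_os_log_written');
    -- Innodb_data_written / Innodb_data_writes gives a rough average bytes-per-write.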
  3. dfleisch

    Running Automate in Azure

    We are worried about deadlocking and per-query execution times, and I will explain why. The process loops in the software require that certain queries complete before others can process. Simply put: you're waiting on one query because it locked a table to do updates. If you cannot process that query faster than 5MB/s and it needs 40MB/s to complete in 1 second, then your server is not going to magically work UNTIL you have faster commit performance at QD1. So while a high IOPS number at QD32 helps recovery time after a deadlock by clearing the queues quickly, we only care about a single query's write performance at the moment of the deadlock. We don't care if you can do 1MB/s x 32 = 32MB/s at that queue depth, because we are waiting on the 1MB/s PER THREAD that is bottlenecking the server queues.

    Take for example the query which rebuilds EDFs for computers. It locks the computers table, and agent check-ins (which for a 10k-agent server pile up pretty quickly), until it's done. On a server with slow per-transaction IOPS, these queries are larger and higher in row count, so they take anywhere from 11 seconds (fast server) up to 2-3 minutes (slow server). How often do you want to see 20,000 requests in IIS waiting to commit to the table if these queries run multiple times a day and stretch an EDF rebuild out to 3 minutes? Answer: you NEVER want to see that many in the queue. If you do, performance at the higher queue depth certainly helps, and it's not like we are asking you to go WITHOUT acceptable performance at higher queue depths; it's just that per-thread performance is the problem, and if you don't meet our minimum, you will be affected by lockouts and reliability concerns.

    When that rebuild runs (like many other queries which do something similar, such as for commands), the 2-minute gaps in processing (locks) for agent check-in INSERT queries make the server unresponsive while the IIS queues are high, BY DESIGN of IIS. Soon 3306 is going away and ONLY web calls will be made by the fat client, even locally. We can work around these pauses in web-server response time with various IIS and MySQL tweaks, but in the end we are simply band-aiding the real issue. While waiting for a single query to run on a base Azure instance, sure, the server can queue 32 other items at that same data rate and complete them, but unless the query is redeveloped and re-released (hint: it already has been, many times) with code that prevents locking, creates multiple partial locks instead, or doesn't lock at all, we are at the mercy of this and other queries with this design.
    --The requirement is there simply for the worst-case scenario: a core query, an in-house or third-party plugin, a custom or stock monitor, or a script that may not be completing quickly *for whatever reason*.

    *IMPORTANT* Not all environments are the same; people use various custom queries. So, for a core-product minimum recommendation for disk performance, to set a standard for how fast the product works, we make the generalization that everyone should be fast enough to write that 16k block at a certain speed, since EVERYTHING done in the application or web interface relies on queries. We set the 3,000 IOPS number not only because it's an easy spec to meet, but because it gives smaller and medium-size servers a quick enough turnaround on any straggling 'locking' or full-table-scan queries that major delays are avoided.
    Power users who decide they want a 'very fast' or 'faster' product will go above and beyond this spec. Yes, there are limits to application efficiency based on design choices, but wouldn't you rather have a server that can process past a worst-case scenario than one which isn't capable? While the product continues to evolve, we are eyeing different ways to further improve performance and speed, but the simple fact is: a SINGLE 7200RPM SATA disk runs ~3200 IOPS at 16k random write, QD=1, 4 threads. Why make excuses to work around the fact that the server does NOT meet a spec which every bargain-bin, single-spinning-disk consumer desktop will hit, when higher-density drives past the 1TB mark have been a thing since, what, 2008? Also keep in mind that MySQL 5.7 supports 32K and 64K pages, whereas 5.6 allows only 16K, 8K, and 4K, so maybe you'll gain some benefit there (at the expense of wasted space) if Azure scales better with larger block sizes in a QD1 scenario.
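    As a side note, if you want to watch this lock pile-up happening live during a rebuild, a rough sketch like the following works on MySQL 5.6/5.7 (these views moved to performance_schema in 8.0); it's generic InnoDB tooling, not something specific to Automate:

    -- Who is waiting, and who is blocking them.
    SELECT r.trx_mysql_thread_id AS waiting_thread,
           r.trx_query           AS waiting_query,
           b.trx_mysql_thread_id AS blocking_thread,
           b.trx_query           AS blocking_query
    FROM information_schema.innodb_lock_waits w
    JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
    JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;

    -- Anything that has been running for more than 60 seconds.
    SELECT id, user, time, state, LEFT(info, 120) AS query_start
    FROM information_schema.processlist
    WHERE command <> 'Sleep' AND time > 60
    ORDER BY time DESC;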
  4. Mike, please create a ticket for the server down / ERT team to review this with you; I'll take care of top-to-bottom troubleshooting on this vs Support. We are confident we can identify an issue, but we do need to start a remote session / call to ID a few points before starting. David Fleisch, ConnectWise Automate Support
  5. dfleisch

    Running Automate in Azure

    Michael, the Azure temp drive has good IOPS and is not persistent, so it may well be a RAM disk. The MySQL temp directory can be pointed there, but if the server crashes and the drive isn't available, even to show the #sql files that got force-closed (and may be corrupted), the MySQL log will be littered with errors about missing #sql files. That can lead you down a dark path of IBDATA1 inflation, resulting in long-term performance impact. DB rebuilds are often done because of inflation, so I would avoid this at all costs unless you're ready to rebuild the DB a lot.

    Other points: we suggest allocating 50% of the on-disk data size to the in-RAM innodb_buffer_pool_size. A 10GB DB on disk (excluding logs)? A 5GB buffer for MySQL, or higher, is best. On a 32GB server we use a 21GB buffer pool, because of roughly 20% overhead and because Windows and other processes (including Windows caching routines) need RAM. If your 21GB buffer on that 32GB server fronts a 42GB DB, it technically meets best practice. Larger DBs, or DBs whose hot queries need a RAM buffer exceeding 50% of the total on-disk data size for \Labtech\, may need further optimization or more RAM.

    innodb_buffer_pool_instances goes along with buffer pool size. Too few instances with lots of connections and thread lock contention occurs. With a 40GB buffer, 40 instances is right, and 39 or fewer also works; 41 instances on a 40GB buffer is a no-no, because it breaks the rule of at least 1GB per buffer pool instance. The maximum value for this setting is 64.

    I would prioritize disk performance and 4 cores FIRST, then work on more RAM. A real SSD, 4 cores, and 16GB of RAM is the minimum spec I would set for servers with a 0-20GB DB size.
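    To put numbers on those ratios for your own server, a quick sketch like this works (plain information_schema arithmetic; the 50% and 1GB-per-instance guidelines come from the post above, not from the query itself):

    -- Current data size vs buffer pool, instance count, and tmpdir location.
    SELECT ROUND(SUM(data_length+index_length)/POWER(1024,3), 1) AS on_disk_data_gb,
           ROUND(@@innodb_buffer_pool_size/POWER(1024,3), 1)     AS buffer_pool_gb,
           @@innodb_buffer_pool_instances                        AS pool_instances,
           @@tmpdir                                              AS tmpdir
    FROM information_schema.tables
    WHERE engine = 'InnoDB';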
  6. dfleisch

    Running Automate in Azure

    Hi Michael, it looks to me like the MySQL server instance uses 4k and 16k block sizes for reading and writing table data. I ran HDTune Pro's Disk Monitor tool to see which block sizes were most common while the application was running. Give it a shot and see if you can isolate the same! https://www.hdtune.com/download.html While our other services do things besides talk to SQL, they still mostly rely on MySQL for a response through a query, and that is the bottleneck. We need to do more testing to confirm the finer details, but for now this should give you a general idea of what the software's calls use.
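    If you'd rather not install HDTune, a rough cross-check is possible from inside MySQL, assuming performance_schema is enabled (the default on 5.6+). Average bytes per read/write per file hints at the block sizes in play; this is my own generic query, not something we ship:

    SELECT file_name,
           count_read,
           ROUND(sum_number_of_bytes_read  / NULLIF(count_read, 0))  AS avg_read_bytes,
           count_write,
           ROUND(sum_number_of_bytes_write / NULLIF(count_write, 0)) AS avg_write_bytes
    FROM performance_schema.file_summary_by_instance
    ORDER BY sum_number_of_bytes_write DESC
    LIMIT 10;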
  7. dfleisch

    Running Automate in Azure

    @tlphipps Please forgive me if I came across that way. I read through the thread and want to put out a message to everyone first: my goal isn't to insult anyone about their choice of provider; heck, that's a business decision more than anything. This started as a small post, but I want to make a few items clear about our approach and the reason these tests are being run: *to help you guys stay up and stable!* The goal of my last post was to call out what works and what doesn't, and to forewarn anyone, new or seasoned, who may not be used to looking at these numbers and troubleshooting an issue caused by IO.

    First, the volume of IO calls made by Automate is massive. The number and variety are complex enough to create issues, but they can be hard to identify because each server differs in how the problem exposes itself and how it manifests. Your first line of defense is to meet requirements. Automate 2019.x and Automate 12.x run a variety of services that require a certain minimum spec. The CWAFILESERVICE, Solution Center, the startup routines of LTAgent, and more all require a low-latency, high-throughput disk. Most IO issues first show up as timeouts when the server is rebooted while all agents are ALSO trying to check in; I've seen 10,000 requests in web-garden-type IIS queue buckets, which eventually creates a 100% CPU problem and slows the queries further. If your server cannot process that many IO operations per minute, it may take 30+ minutes to settle down, or at worst require a rename of the eventlogs table, before the startup of LTAgent finishes. Without an LTAgent startup/POST you may never get the server online, and these issues happen because it couldn't process x data in x time. Add custom monitors and the wild variety of query calls that differ per company and you've got a recipe for disaster in cases where IO is not plentiful per thread and a thread deadlocks.

    If you've bought into Azure and have it working, I am not going to tell you to move off it! Running hardware hosted by the makers of Windows can provide lots of benefits and advantages over other providers. I just don't wish to see a partner reaching out to Microsoft and going through all this trouble to find out what is going on with performance, when a design choice at the provider (one which they will defend) has optimized the IO for a purpose other than Automate's needs on the storage plan that fits their budget. It is good to have data showing that the SSD stripe options on that instance size / type may provide the required performance, as previously this was very cost-prohibitive.

    What I'd like to raise awareness of is how little it takes to create a server-down situation by backing up the web server when IO demand exceeds availability on a *per-thread performance basis*. The threshold may already be close to being exceeded as the DB's table row counts and data-set sizes increase, and for the cost involved you want a much larger window of headroom for increased load, IF the numbers are anything like what I saw back when the Azure cloud was released. To give y'all an idea how bad it was, I've attached two screenshots.
    --First is an Azure server from 2015; notice the lack of write speed and 4k performance, with results that would likely fail the DISKSPD.exe test.
    --Second is a newer Azure server from 2018 on a higher IO plan, but still showing only 60MB/s sequential (4TB SATA drives are at 150MB/s today); the 16K write MB/s looks better here.
    Latency measurements were not taken. Since these screenshots pre-date the diskspd.exe test we use, the combined picture of MB/s, latency, and the resulting IOPS was not available to get a clear view. Personally, I have not run the numbers on a newer 3x SSD Azure setup, and I may have gotten a bit ahead of myself. The current VMs may have more headroom with the improvements MS has made; all I am saying is: just be careful!! If the plan works for your business and provides the features and security you aim for within budget, with performance looking good, that's all that matters.

    In my eyes, the bottom line is:
    1. We've troubleshot Azure performance issues from day one and got a bad taste for their servers as a result. Azure has improved, but how much is hard to say, as I don't have raw numbers / data; I'm just speaking in general (it would be good to run some benchmarks on different tiers).
    2. I dislike the trend of a growing company running into issues with ConnectWise after speccing the server to our 'recommended' specs on the documentation site, then finding later they are going down due to an infrastructure scaling / hardware problem.

    One of the great things we've done as of late is to provide a standardized benchmark to set an IOPS bar, in the hope that you keep that headroom. Happy MSPing out there!
  8. dfleisch

    Running Automate in Azure

    The command we give consultants to run to pre-check a server uses a queue depth of 1. Microsoft can likely scale the Azure disks to run well at queue depths of 32 or higher, but MySQL is not going to behave like that: it will use 4k and 16k block sizes on the disk and write randomly at a single queue depth. The test we give you validates whether the server can sustain the required IOPS on a single MySQL query, something that varies widely across hardware setups. From experience, we are not able to do an EDF rebuild, search rebuild, or other intensive tasks below a certain IOPS number on that 16k random write, 4-thread Diskspd.exe syntax, and as a result server-down tickets come in complaining that the application is not usable.

    For some history: back when Azure came out, people were running the base storage plans and not able to meet spec at these 16k and 4k block sizes, but if you've upgraded to SSD on Azure there's a better chance you'll pass. The issue is that even Azure's striped SSD setups are still many times slower than a single SSD drive, which today is very inexpensive. If you're paying for Azure and getting less performance than a $100 piece of hardware, I find that a waste of resources, EXCEPT that cloud hosting may be better for a company's situation because of the management of the OS / hardware, ISP, and power / heating / cooling / redundancy guarantees. Because of this, I UNDERSTAND why people are doing it, but there has to be some sort of minimum spec, because most server-down tickets boil down to 'why doesn't the application work'. Answering the question "Why is the server not able to perform well on a single thread, but scales well with a high queue depth?" is generally the answer to the problem: the storage is not optimized for the type of load the application generates.

    There have been talks about standardizing a disk performance test for some time here at ConnectWise's Automate division. To define minimum specs, we have to cut off the slower servers somewhere. Yes, I've seen systems with great IO perform slowly on an EDF rebuild or other queries due to row counts or optimizations we may need to add. Yes, I've seen slow disks work for the intended purpose of serving MySQL queries quickly. There are always outliers and scenarios where you may be able to get by. HOWEVER, in the worst-case situation where a query runs slow, IIS (the web server) can queue a pile of connections (due to a deadlock on, say, computers while an EDF rebuild runs), and those agent check-in requests filtering through the web server WILL stall the GUI, since waiting connections go into the queue once 100 requests are pending on MySQL. BECAUSE OF THIS, the problem may not be obvious until a long query runs, and as you grow it can get worse and worse. The DiskSPD.exe syntax pasted above is the best defense we have against your server not working correctly long-term: simply find a provider that can sustain the IO we suggest at minimum. Amazon EC2 instances using EBS-optimized volumes run about 3200 IOPS on that test, and we've set the minimum at this level since single 7200 RPM SATA disks also post these (or higher) numbers.
    It sounds like Microsoft is trying to convince people that the config they've chosen for their platform and storage pools is 'best', yet from your note the Automate server requires 3 SSDs JUST to sustain the performance our apps need, while a single ~$100 Samsung outperforms the 3-SSD stripe ten times over. The important part: they are artificially throttling the IO and get to choose what qualifies as an 'SSD'. So, while the application may work after 3 SSDs are combined, the result of the IOPS test we run may still not be optimal for Automate's application calls, even if it 'works' to a point. Heck, for that ~$800 price, why not build a server with hardware that benchmarks at 33,000 IOPS (NVMe SSD) for pennies in comparison, ship that hardware to a datacenter, and call it a day?
    --The minimum spec is NOT voodoo to reach, and if you are below 3000 IOPS, you have no excuses besides your own choices for going against people who have direct experience with the application's design. Azure is simply not the best cost/performance choice for Automate. Can you get it to work? Maybe. How well? A little more time wasted troubleshooting may answer that once the load scales with company expansion. Be careful out there with what you buy into!
  9. That URL is below, and it can also use a custom port if your agent templates are designed to change the check-in port by editing the 'server address'. Here's the specific URL it hits by default, for agent ID 89: http://FQDN/LabTech/agent.aspx?89c5&10 The text and numbers after the ? are specific encoded items we can decode, like what it is sending, how many items, etc. We've seen cases where the web GUI can be reached but an IPSEC policy blocks agent communication through LTSVC.exe, so reaching the GUI does not guarantee communication. Better to try hitting that URL and look for the agent version returned, to verify the URL / ASPX can be served (again, not a guarantee even if agent.aspx can be reached; just a guideline). Better yet, run Wireshark to see if deep packet inspection is the culprit when troubleshooting communication issues.
  10. Yes, this is a known limitation of Chrome 71 and the \Automate\ Web Control Center. Our fix is to use Firefox for now, until we can patch this limitation in Automate 12 patch build .492+. Take this site as an example of a working patch .489 server on Chrome 71: https://dfleisch.ddns.net/automate
  11. dfleisch

    Intel Processors

    Maybe do this by Hardware ID, then keep a cross-reference for the different generations of Intel processors? https://stackoverflow.com/questions/7480556/how-to-get-hardware-id-for-a-network-adapter-programmatically-in-c-sharp The Hardware ID should be available via a WMI query from a remote monitor, as sketched below.
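    For the processor angle specifically, a WQL sketch along these lines could feed a remote monitor. The linked example queries Win32_NetworkAdapter; Win32_Processor is my assumption for CPUs, and the generation cross-reference table would still have to be maintained by hand:

    -- WQL (WMI query language), runnable via a remote monitor, wbemtest, or wmic.
    SELECT ProcessorId, Name, Manufacturer FROM Win32_Processor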
  12. dfleisch

    Speed optimization of On-Prem Automate

    If you'd like a sneak peek at the testing metrics ConnectWise plans to employ to help prevent configuration problems, voilà:

    diskspd.exe -c10G -t4 -si16K -b16K -d30 -L -o1 -w100 -D -Sh C:\testfile.dat

    This command-line syntax fully supports changing the path, test file size, and block size to find out what performance your disk provides at different settings, and you can choose read or write as well with this Microsoft tool.
    ------
    The goal is to suggest 3000 IOPS in the 'Total' column shown below for the 100% write @ 16k metric as the 'minimum spec' we require for new builds. For anyone who doubts performance, or does not believe performance is a factor in issues on the server, we first need to confirm your current Automate patch version and whether certain indexing has been added to the database to work around known issues with patching queries and networkdevices-table queries that also join the 'computers' table. Once we rule out those factors, it should come as no surprise that disk is important to review. Without these indexes (coming in patches 11 and 12 of Automate 12), agent check-ins and Control Center logins can back up and spike IIS / w3wp.exe CPU use, along with MySQLd.exe's usage, by way of held-open connections eating resources and responsiveness. Some of these are known issues and are included in a current, future, or (as yet) no patch. To find out if this is the case, please contact Automate Support and open a ticket with a subject like 'Server performance review' to see if your server is optimized with all the known fixes we have.
    ------
    To get the bigger picture, also run a 4k write IOPS test to see the contrast between these two properties of the disk subsystem. Below, I've run two 16k write tests, as the validation would do pre-server-install. *You must edit the path C:\testfile.dat to the drive you wish to test; it doesn't matter where diskspd.exe itself is located, as the exe location won't change the results (the test runs on the drive you specify via the path in the command-line syntax).

    I've attached a benchmark of a directly connected (SATA 6G) 1TB 7200RPM 'performance' Hitachi Deskstar spinning disk on the right. On the left, a Samsung 850 EVO 250GB is running the test virtualized. Note the total IOPS in these screenshots under the 'Total:' I/Os-per-second output of the test result screen in CMD.exe. From these results, the HDD scores ~3716 operations per second in total when writing 16k blocks, while the SSD does 12,264 IOPS on a 16k write. Keep in mind that varying the test file size via -c10G (default 10GB test file) changes little for SSDs, while HDDs can be up to 30% faster on the first sector of the disk than the last, so on new drives use a size closer to the total disk space to see the drive's average IOPS, at the expense of a longer run time.

    The test will give you an idea of whether your disk meets our requirements for new server builds. Example: any server scoring below 3000 IOPS may hit EDF rebuild, search rebuild, patching rebuild, nightly maintenance, or intermittent performance issues as various queries run throughout the day. Some of these trace back to IO issues, others to an optimization we must apply through patches. This is directly linked to high average CPU usage by the DB AND web server due to 'managing the idle time' between IO requests and transaction commits. Some of this is preventable, and some of it we can re-write code to optimize, but get this out of the way first: run a benchmark!
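    If you want to pair the diskspd numbers with MySQL's own view of the disk, a quick sketch like this can hint at whether the server is IO-bound (generic InnoDB status counters, not an official ConnectWise health check):

    SHOW GLOBAL STATUS WHERE Variable_name IN
        ('Innodb_data_pending_reads',
         'Innodb_data_pending_writes',
         'Innodb_os_log_pending_writes',
         'Innodb_buffer_pool_wait_free',   -- non-zero = buffer pool too small or flushing too slowly
         'Innodb_buffer_pool_pages_dirty');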