{"id":60,"date":"2015-02-27T18:50:21","date_gmt":"2015-02-27T18:50:21","guid":{"rendered":"http:\/\/svops.com\/blog\/?p=60"},"modified":"2015-04-05T23:09:32","modified_gmt":"2015-04-05T23:09:32","slug":"better-than-newrelic","status":"publish","type":"post","link":"http:\/\/svops.com\/blog\/better-than-newrelic\/","title":{"rendered":"Better than NewRelic"},"content":{"rendered":"<h1>Application Performance Logging<\/h1>\n<p>You can learn a lot about your code by instrumenting it. \u00a0You&#8217;ll find methods that are called more than you&#8217;re expecting, or be able to identify which part of your code is slowing the app down.<\/p>\n<p>There are Saas solutions for this, like New Relic. \u00a0From my experience, New Relic &#8220;runs&#8221; out of the box, and may provide information on common tools (MySQL, etc), but doesn&#8217;t provide any real information about <em>your<\/em> app without adding instrumentation calls to your code.<\/p>\n<p>Also, New Relic works on an <em>aggregated summary<\/em> of your application. \u00a0If it supplies details on a <em>specific request<\/em>, I never found it.<\/p>\n<p>If you have to instrument your code, and you want to know exactly what your users are doing, then what&#8217;s the appeal of New Relic?<\/p>\n<p>Recently, I was involved with a project to instrument some &#8220;under performing&#8221; python code. \u00a0New Relic had been installed, but had provided no value.<\/p>\n<p>Caveat: I was involved on the analysis side. \u00a0Another developer wrote the logging framework.<\/p>\n<p>A performance\u00a0class was created and we added a custom <a href=\"https:\/\/docs.djangoproject.com\/en\/1.7\/topics\/logging\/#topic-logging-parts-handlers\">django handler<\/a>\u00a0that initialize a performance object for the request.<\/p>\n<p>To instrument a method, you would wrap it in a <a href=\"https:\/\/docs.python.org\/2\/reference\/compound_stmts.html#the-with-statement\">&#8216;with&#8217; statement<\/a>:<\/p>\n<pre class=\"python\"><span class=\"pykeyword\">with<\/span> performanceTracking(\"block name\"):\r\n         some block of code<\/pre>\n<p>The &#8216;with&#8217; syntax provides an __enter__ and __exit__ hook, which was used in the performance class to start and stop a timer for that block of code. \u00a0On __exit__, logging information for this block was added to a data structure for the entire request.<\/p>\n<p>When the request finished, the handler would write the entire logging block using the standard logging mechanism. \u00a0Log these at DEBUG and it was easy to disable it in production.<\/p>\n<p>What you ended up with was a nested set of performance information:<\/p>\n<pre>{\r\n    \"requestid\": \"12345\",\r\n    \"name\": \"my web page\",\r\n    \"calls\": 1,\r\n    \"duration\": 200,\r\n    \"children\": [\r\n        {\r\n           \"calls\": 2,\r\n           \"name\": \"mysql user lookup\",\r\n           \"duration\": 190\r\n        },\r\n        {\r\n           \"calls\": 1,\r\n           \"name\": \"something else\",\r\n           \"duration\": 10\r\n        }\r\n    }\r\n}<\/pre>\n<p>You could now see that &#8220;mysql user lookup&#8221; was called twice (!), and was responsible for 95%\u00a0of time spent (!). \u00a0Even better, you knew this information for <em>this individual request<\/em>. \u00a0You can summarize the performance of these blocks across the system (ala New Relic), but you could also zoom in and see the details for this particular user. \u00a0Powerful stuff, right?<\/p>\n<p>With your own code, the possibilities are limitless. \u00a0In a web app environment, maybe the page is authenticated. \u00a0Add the user&#8217;s ID to the logging block, and now you can see <em>exactly what happened<\/em> for <em>Mary<\/em> on <em>Tuesday<\/em> at <em>10AM<\/em>. \u00a0If you run multiple applications or subsystems, add their identifiers to the logs and you can break performance out by component.<\/p>\n<p>Once the information was written to the logs, it was loaded into an <a href=\"http:\/\/www.elasticsearch.org\/\">Elasticsearch<\/a>\u00a0cluster for analysis.<\/p>\n<h1>Performance Analysis<\/h1>\n<p>With all of this data, the next step is to see what&#8217;s going on. \u00a0The ELK environment has come a long way since in the year since this project, so it would probably be even easier now!<\/p>\n<p>With the data properly loaded parsed in\u00a0logstash and fed into elasticsearch, you can make dashboards in kibana to show things like:<\/p>\n<ul>\n<li>total application performance over time (graph the top-level &#8220;duration&#8221; field).<\/li>\n<li>the performance of each code block over time<\/li>\n<li>the most-frequently called code blocks<\/li>\n<li>the worst-performing code blocks<\/li>\n<li>does performance degrade with load?<\/li>\n<\/ul>\n<p>We found it to be very useful to wrap every third-party API call with this framework. \u00a0If the vendor was slow, we&#8217;d be slow, but now we had our own data to isolate the cause and work with the vendor.<\/p>\n<p>Combine this information with other sources of data and you can determine neat stuff like:<\/p>\n<ul>\n<li>the overhead added by your web server (difference between total time and the applications total time).<\/li>\n<\/ul>\n<p>Doing a basic code review on the worst-performing block would usually make it very clear why the performance was suffering. \u00a0I was the largest creator of tickets for the dev group in the company.<\/p>\n<p>Using this information, we focused on two goals:<\/p>\n<ul>\n<li>making the lines flat, i.e. creating consistent, scalable performance.<\/li>\n<li>making the lines lower, i.e. increasing overall performance.<\/li>\n<\/ul>\n<p>We were very successful in both of these goals.<\/p>\n<p>We had several interesting &#8220;deep dives&#8221; into this data, but one of them stands out:<\/p>\n<p>The whole system was monitored with Gomez, which one of the executives liked. \u00a0Gomez started to report performance problems, but they weren&#8217;t seen in the summary data that we graphed.<\/p>\n<p>Since we had the details, I was able to find the exact request that Gomez had made (user name and time), and see the block of code that was slow.<\/p>\n<p>It turned out that the system was writing a row into a table every time a user logged in. \u00a0The Gomez test account logged in <em>a lot<\/em>, so there were a lot of rows for this user. \u00a0The application\u00a0pulled these rows, looking for some recent information (last search string, etc). \u00a0Unfortunately, the app pulled in <em>all<\/em> the rows, not the just recent ones that it needed.<\/p>\n<p>It was easy to find and easy to fix. \u00a0Management was happy.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Application Performance Logging You can learn a lot about your code by instrumenting it. \u00a0You&#8217;ll find methods that are called more than you&#8217;re expecting, or be able to identify which part of your code is slowing the app down. There &hellip; <a href=\"http:\/\/svops.com\/blog\/better-than-newrelic\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[11,3,5],"tags":[],"_links":{"self":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts\/60"}],"collection":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/comments?post=60"}],"version-history":[{"count":1,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts\/60\/revisions"}],"predecessor-version":[{"id":61,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts\/60\/revisions\/61"}],"wp:attachment":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/media?parent=60"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/categories?post=60"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/tags?post=60"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}