Shortdark Web Development

Case Study: Composer Package

Development 30th Aug 2021. Time to read: 11 mins

PHPAPISVGComposer

This is a look at a composer package that returns an SVG graph from an array of data: shortdark/socket composer package. There may be better ways to display graphs on websites with existing open source code, such as javascript libraries. I show the history of myself working with this code, and look at some development changes that happened in updating the code to work as a composer package.

History

I wrote some code that showed the monthly stats for one of my websites at the time, the stats were concerned with how often I updated the website. The line shows monthly updates and the bars show the average for each year...

One of the first implementations of a SVG graph from 2015

Quite soon after this I separated the database from the graph. To begin with I believe the database parts were removed from the code and an array was passed in. The data did not change very often, so one version used cron to make XML files once a day then the XML was used to make the graph. Other versions used JSON. Here is a multi-line version of the graph...

Multi-line SVG graph

At the same time as making the line graphs, I also made pie charts, bar charts and world heat maps.

Another use for the line graph code was to display sterling against other currencies. There are two Y-axes, CNY uses the red axis on the right and USD/EUR using the black axis on the left...

Currency graph

The currency graph used javascript to fit the graph to the size of your mobile device. The SVG itself is not mobile-friendly so altering the height and width of the graph to match the screen size was a neat way to make it work on mobile devices.

Up to this point each graph was similar but different. Things like combining line and bar charts, and changing axis for different data made each graph quite specific to its use case.

When I made a WordPress plugin it also used statistics and graphs. This time the WordPress plugin showed a specific metric, but it was general enough to be used on any WordPress blog. The plugin can show a line graph for tags, categories and any custom taxonomy the person has added...

Post Volume Stats screenshot

While you can make the case that another pre-existing library should be used instead of creating one yourself. But, for a WordPress plugin, not having any dependencies worked really well.

The WordPress plugin, Post Volume Stats also uses pie charts and bar charts and is still available on the WordPress plugin repo.

Putting the Code Into A Composer Package

Starting to keep the graph functionality separate from the database made it very straightforward to put the code into a composer package.

The shortdark/socket composer package was originally copy-pasted from somewhere it was being used with some slight tweaking to make it work. With the code being old and because it was originally being written for a specific task it had bugs and needed some improving.

In order to show it working, I've used stock market data. You can select one or more stocks and pass them into the graph in an array. You are able to look at one stock (with 50 day and 200 day moving averages), or you can compare up to 10 stocks on one graph.

Shortdark/socket example graph

You can change the start date of the data as far back as I have the data collected. The end date will always be the previous weekday unless the stock(s) being shown have ceased trading.

Usage

Three stages...

  1. Get the data from the API, check it and store in a database once, 5 days a week.
  2. Process the data and put it into an array.
  3. Use the array to make the graph.

1. The API

Apart from my time, two things cost money with this project: the server and the API. I wanted to only make an API call when I needed to. The data does not change over the weekend, so I do not make any API calls for the weekend days. Then there are several other checks the script goes through before we even hit the API. The list of stocks only includes active stocks that has a start date and not an end date, then it checks that we do not have the data already. So, we only hit the API when we're pretty sure we need to.

Now, that we're calculating moving averages for the stocks ourselves, when we collect the data we update the moving averages. Calculating two averages per stock should not create too much of a load, but even so I do not want to go attempting a calculation if it is not needed. Before I attempt to create an average I make some more checks so that I'm not wasting computing power.

An anomaly concerned with getting the data from the API was Reckitt Benckiser changing it's name 1. The issue here was that it also changed its Tradable Instrument Display Mnemonic ("TIDM"), or ticker, from "RB.L" to "RKT.L". Unfortunately, on the day the name changed the old ticker stopped working and the new ticker began. The API call identifies the stock from its ticker, so we had a problem. The two tickers are set up as two different stocks with the correct start and end dates. Once we have all the data the old ticker is made inactive, so it will not appear in the list of stocks to update. I'm sure this will happen again, so it's good to know there's a quick fix.

2. Processing Data

The data is just taken from the API and all the graph is doing is taking an array and displaying it. The hardest part of the project is processing the data into an array that will give us the graph we want. The pain we have gone to in only calling the API the minimum number of times should mean that the data is perfect, but it may not be. Whatever data we've got on the database we need to get the composer package an array it can make a graph out of, i.e. we have to present the array in the correct format for the graph to read it.

3. The Graph

Creating the graph with a composer package ensures the final stage of the process is completely separate. It's on a separate GIT repo. Being able to modify both repos mimics a larger website where there may be different parts of the website may be completely separate to others.

The graph class itself is pretty simple, and I've tried to break it down into methods that are fairly small and well-named. I also wanted it to be as customizable as possible, so instead of hard-coding anything we are able to change the settings when we call the class. Other than the user defined settings there are some rules concerning the data.

To make the graph the same size as whichever device is viewing the content I use javascript to pass the height and width values into the graph.

Development Discussion and Anomalies

There were four anomalies that arose during developing the package...

  • Stock Splits
  • New stocks getting listed.
  • Public holidays and missing data.
  • Public companies getting acquired and ceasing to be publicly traded.

I'm going to discuss these anomalies and show with some screenshots to show how some bugs were fixed.

Most of the issues arose because the code was copy-pasted from old code and had a specific use case. It really didn't need a huge amount of untangling, there were a couple of places that mainly needed simplifying.

Stock Splits

The NVDA stock had a 4:1 stock split on 2021/07/20 2. Below, is how this affected the data we were collecting from the API.

A stock split causes multiple problems

  • On the date of the stock split there is a dramatic drop in price.
  • After the stock split the API adjusts the historic data to be equivalent to that after the split.

The easy way to fix this was to update all the stock data to match the new post-split data. Then the chart looked less crazy and showed the change in stock price more accurately.

Correcting the data fixes the graph but some data is lost

The 50 day and 200day moving average data is missing in the new data. The moving averages came from the API and were not calculated.

Calculating the averages on-the-fly on every page load would be too much workload. Creating and storing the averages in the database would ensure that the graph would not load any slower. Which I eventually did.

Another change in this version is that I have made the week numbers across the top optional. As no-one is using this composer package but me I made the default not to show them. Ideally when making some functionality optional the default would be to make the previous functionality the default, so that people who are using the code do not update composer and wonder why something is missing.

Correcting the data fixes the graph but some data is lost

I have seen a few different websites all give different values for 50 and 200 day moving averages. I'm not sure if there is one universally-recognised way to calculate them. My way is different to the previous values, but I have seen different versions that are different again.

How to deal with stock splits in future

There is an API that gives stock split information. If I trust the API I could deal with stock splits automatically in the future.

While I don't mind hitting an API once a day for a test project like this, I do not want the possibility of an API causing this project to do a lot of unecessary work without my knowledge. The safest thing would be for the API to flag up a possible stock split, then I could manually approve it.

New Listings

New listings have no data before the IPO date. That just means that the earliest the single graph can show is the IPO date.

The graph composer package simply displays the data you send to it. It does not make decisions on what to do with different data.

When we display the data for a single stock the graph will only show the dates it receives in the array. When we're comparing a stock that has recently been launched to other older stocks, we need to make a decision which data we want to put in the array we send to the graph.

From a previous version of the package we have the Paypal and Square data from the start of the year, so we can compare the two like this...

Comparing two stocks which both have all the data

But, if we want to compare Paypal and Square to the newly public Coinbase, we have a choice to make. Either we show the two lines in full and only begin the third line when it starts trading, or we simply start the graph when all three stocks are trading. I have decided that we have to start at the IPO date or later. This is controlled by the array we give the graph, if we wanted to display different data we change the array.

When comparing the same stocks to a newly public company the first date must be the IPO date

At the time I took these screenshots, the graph was able to squash the graph so that the full range of data it receives is displayed, but it did not stretch so that shorter date ranges fill the whole graph. I have now changed this, but at the time the issue was that there was a bug that meant the graph lines were crossing the Y-axis.

Public Holidays and Missing Data

Originally gaps in the data caused problems in the graph. This was due to the original implementation not expecting any gaps at all. This new composer package was therefore not processing the missing data correctly so the graph was not getting displayed correctly.

We need each line to be continuous without gaps so filling in the public holiday data with the data from the previous working day. This was a simple and cheap solution because I did not need to make any API calls. Once the public holiday days had data the arrays going into the graphs did not have any gaps and so the graphs looked normal. This was a one-off script.

The next step was to modify the graph code to allow gaps in the graph lines. This means that if the API stops sending data for a period of time the overall graph will still be able to be displayed as normally as possible. Now, we are able to have graph lines that can be broken (start-stop-start-stop).

The graph is able to deal with missing data at the start of, or anywhere in the, graph line

Acquisitions

The day after Nvidia's stock split, on July 21st, 2021, Salesforce completed the acquisition of Slack 3.

Acquisitions or de-listings should be similar to new listings, and they are basically the same. As long as the data going into the graph is correct it should present a graph that stops on the final closing date. The value of the stock after acquisition can't be zero because then on the percentage graph you would get something like this...

Graph after an acquisition

So, the solution is to make sure the value after acquisition is null. Then, if the first (most recent) data point is null the graph must know how to deal with this.

In the same way that public holidays were not correctly dealt with if the data was missing, the graph package did not deal with the first data point being null very well either. Changing the data in the array from zero to null, and fixing the graph to deal with null values better meant that the graph looked less crazy...

Fixed missing data points

Missing data points (null values) are not displayed but all the data points that are supplied get displayed.

Future Updates

My TODO list for this is mainly around calculating the widths of the branding and legends boxes automatically, without the user having to specify them. Other than that the composer package seems usable at the moment.

Summary

Making an SVG with a PHP composer package may not be the best thing to do for all use cases. I've talked about this SVG graph composer package with some history to show where the idea came from.

The issues that arose when using the package initially were sometimes down to the data, the processing of the data or the graph. We can't always control the quality of the data from the API, so by making the data as complete as possible when we collect it, then understanding what data may/may not exist we can create a graph that shows what we want to show. Many of the bugs were solved by simplifying and modernizing the code. These changes also make any future changes much easier.

References


Previous: Learning a new Coding Language