Backing Up Data to Amazon S3

Tuesday 13 December 2016

I decided I wanted to back up my database somewhere other than my server. All of my code is in git, so the only thing that could be lost in case of server errors is the database. To start, I wrote a little shell script to dump the database using mysqldump. I wasn't sure where to put the SQL file to keep it off-server. My first thought was to put it in git, which was easy to do in the script, so I updated the script to add the file to git, commit the changes and push the repo up.

After a bit more thought I decided that might not be the best way to do it. It worked fine, but my usual workflow is to make all changes locally, push them to git and then pull to production - I don't make any changes on the production server unless absolutely necessary. While adding files to git that don't exist in my dev environment shouldn't really cause any problems, I thought there must be a better way.

So I decided to put the dump file into an Amazon S3 bucket. Laravel can use S3 as a filesystem, as documented here, but I had tried to use this before and not had much luck. I saw that Amazon had a PHP package to interact with S3, the AWS SDK for PHP, so I thought I would try that out. After a little more digging I found that Amazon also has a package specifically for Laravel, which is located here. That turned out to be the winner. Instead of reading pages of documentation for Laravel's filesystem or the Amazon SDK, all I needed was a few lines of code and I was up and running. As a note, this package keeps the Amazon keys and secrets in the .env file, which is a lot better than keeping them in the filesystems.php config file like Laravel does. If you are going to use Laravel's S3 filesystem, I suggest you update the filesystems.php file to pull them from the .env.

Now that I was able to upload files to S3 from a browser, the next step was to create an artisan command that I could add to my shell script. The Laravel documentation for this was clear and easy to follow. The only problem I had was a typo that for some reason didn't throw an error locally, but did on my production server. Other than that this is tested and working.
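For anyone who wants the general shape of the backup step, here is a rough sketch in Python rather than the shell script and artisan command I actually use. The function names, the `backups/` key prefix and the `upload` callable are all made up for illustration, and it assumes the MySQL password comes from an option file such as ~/.my.cnf instead of the command line.

```python
import datetime
import subprocess
from pathlib import Path
from typing import Callable, List


def dump_command(database: str, user: str, out_file: Path) -> List[str]:
    """Build the mysqldump invocation (hypothetical helper).

    The password is assumed to live in a MySQL option file, so it is
    never passed on the command line.
    """
    return [
        "mysqldump",
        f"--user={user}",
        "--single-transaction",        # consistent snapshot for InnoDB tables
        f"--result-file={out_file}",
        database,
    ]


def object_key(database: str, day: datetime.date) -> str:
    """Dated key, so each upload is kept instead of overwriting the last."""
    return f"backups/{database}-{day.isoformat()}.sql"


def run_backup(database: str, user: str, work_dir: Path,
               upload: Callable[[Path, str], None]) -> str:
    """Dump the database, hand the file to `upload`, return the key used."""
    out_file = work_dir / f"{database}.sql"
    subprocess.run(dump_command(database, user, out_file), check=True)
    key = object_key(database, datetime.date.today())
    upload(out_file, key)              # assumed: a thin wrapper around an S3 client
    return key
```

Keeping the upload behind a plain callable means the dump and naming logic can be checked without touching S3 at all.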

I had considered using S3 for this site in the past, but decided not to since I had problems with the Laravel S3 filesystem. Now that I've integrated with S3 so easily I may revisit that decision.

 

Labels: coding

Redis

Sunday 11 December 2016

I've been messing around with Redis for a little while now and I'm using it in a couple of places on this site. The first thing I did was start caching some DB queries that get performed a lot, like the main blog page, the blog archives menu and the list of recent posts on the home page. Laravel's Cache facade makes caching really easy, and you can switch between the default cache driver, which caches to files, and Redis without changing any of the code. For some pages I use the Laravel Cache facade; for others I use the Redis facade to cache directly to Redis, just for some variety. In general it is probably much better to use the Cache facade than the Redis facade, because you can then switch to a different caching mechanism with one change to the .env file instead of having to rewrite all of the code.
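The pattern behind this kind of query caching is small enough to sketch. The Python below is only an illustration, not the code running on this site: `DictCache` is a made-up in-memory stand-in for a real store like Redis, and its `remember` method is modelled loosely on the idea of Laravel's Cache::remember.

```python
import time
from typing import Any, Callable, Dict, Tuple


class DictCache:
    """In-memory stand-in (hypothetical) for a cache store such as Redis."""

    def __init__(self) -> None:
        # key -> (expiry time on the monotonic clock, cached value)
        self._items: Dict[str, Tuple[float, Any]] = {}

    def remember(self, key: str, ttl_seconds: float,
                 compute: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, or compute and store it."""
        now = time.monotonic()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit: skip the query
        value = compute()              # cache miss or expired: run the query
        self._items[key] = (now + ttl_seconds, value)
        return value
```

The calling code only ever sees `remember`, which is why swapping the store behind it does not require touching the pages that use it.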

I also use Redis to queue some tasks which don't need to be done synchronously, but that's another post.

I never used Memcached because I didn't like the fact that it's all stored in memory, so if the server goes down you lose all of the data in it. Redis persists data to disk by default, so it provides the speed of keeping data in memory with a very low risk of losing the data. In my case, I store the data in the DB and cache it in Redis, but if I were to start from scratch (and had plenty of RAM on my server) I would probably keep a lot more data in Redis.

So, to summarize, I think Redis is awesome and I will definitely make more use of it in my stack in the future.

Labels: coding

Update on Pulling Data from Discogs

Thursday 01 December 2016

I initially ran into problems pulling the data from discogs because I was using a package that provided a Laravel implementation of cURL called Laracurl, and it didn't provide a header that discogs needed. So I made a change to the package and got it working. After someone advised me that Guzzle was a better package than cURL I switched to that, and in the process rewrote my matching code. After running the new, improved, streamlined code I have now matched all but 250 of my records to the discogs data, and 50 of those unmatched records are white labels which may not be matchable. So I am pretty happy with that ratio and will be moving on to my next project now.
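For illustration, here is roughly what a search request with an explicit header looks like. This is a Python sketch using the standard library, not the Guzzle code I ended up with. I'm assuming here that the header in question is User-Agent, which Discogs asks API clients to send. The `build_search_request` helper and the example user agent string are invented, and authentication is left out entirely.

```python
from typing import Dict
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://api.discogs.com/database/search"


def build_search_request(params: Dict[str, str], user_agent: str) -> Request:
    """Build (but do not send) a release search request.

    `params` holds search fields such as artist, release_title, label
    or catno. Authentication is deliberately omitted from this sketch.
    """
    query = urlencode({**params, "type": "release"})
    return Request(
        f"{SEARCH_URL}?{query}",
        headers={"User-Agent": user_agent},   # assumed to be the missing header
    )
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`, once credentials have been added.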

Almost all of my records now have a link to discogs and a thumbnail pulled from discogs in the record detail page. If anyone wants to see additional info on the records such as tracklisting, year of release, or anything else, that info will be available on discogs. I didn't see any need to duplicate that data locally.

The code I wrote to automatically match my records to discogs did not match everything 100% accurately, and I tried to review the matches to make sure they were accurate, but I may have missed a couple here and there. 

Thanks to Discogs.com for making a great API. 

Labels: coding, music

Discogs.com is one of my favorite web sites. I discovered it 15 years ago, and have used it since to research records and music and such. I just recently discovered that they have an API, so I used it to search for each record in my collection, and if it found results, I imported the link to the discogs page as well as a link to the thumbnail images to my database.

The process was rather convoluted. To start I just did a search for the data as it was in my database - artist, title, label and catalog number. But discogs often has multiple entries for each record - maybe it was released in different countries, or re-released, or has different entries for promos, test pressings and white labels. So my starting algorithm was as follows:

  • Match the full data in my database to a search request to Discogs API.
  • If one and only one result is returned, take that one and link it.
  • If more than one result is returned, filter out the ones for promos, mispresses, white labels, etc.
  • If there is still just one result, use that one. Otherwise mark the number of results returned in the database and move on to the next record.
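The steps above can be sketched in a few lines of Python. This is not my actual matching code: the `search` callable, the `format` field and the list of excluded words are assumptions for the sake of the example, and I'm assuming the count that gets recorded is the one left after filtering.

```python
from typing import Callable, Dict, List, Optional, Tuple

Result = Dict[str, str]

# Pressings to ignore when several results come back (illustrative list).
EXCLUDED = ("promo", "mispress", "white label", "test pressing")


def drop_special_pressings(results: List[Result]) -> List[Result]:
    """Filter out promos, mispresses, white labels and the like."""
    return [
        r for r in results
        if not any(word in r.get("format", "").lower() for word in EXCLUDED)
    ]


def match_record(record: Dict[str, str],
                 search: Callable[[Dict[str, str]], List[Result]]
                 ) -> Tuple[Optional[Result], int]:
    """Return (match, result_count); match is None unless exactly one is left."""
    results = search(record)                 # full-data search
    if len(results) > 1:
        results = drop_special_pressings(results)
    if len(results) == 1:
        return results[0], 1                 # unambiguous: link it
    return None, len(results)                # store the count, move on
```

A record that comes back with a count of 0 or more than 1 is left unlinked so it can be handled in a later pass.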

This matched a couple hundred out of the couple thousand records in my database. Most of my records got 0 matches on Discogs, and some still had multiple matches - anywhere from 2 to 35. So I started reviewing the ones with multiples by hand, and I realized that for most of the records with under 5 matches the matches were pretty much equivalent. So for those I just took the first match and assigned it. This matched another couple hundred records.

Next I put aside the few remaining records with between 5 and 35 potential matches and focused on the thousand or so that had no matches. Reviewing some of them manually, I found that many of the misses were due to typos in my database. So my next step was to omit the artist field and just check the title, label and catalog number. I got another couple hundred matches using this method. Then I went on and searched using only the catalog number. This method matched about half of the remaining unmatched records - but I had to manually verify each match because some catalog numbers are not unique.
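That fallback order can be written as a loop over progressively smaller sets of fields. Again, this is a Python sketch with made-up names (`FIELD_SETS`, `relaxed_search`, and the `catno` key for the catalog number), not the code I ran.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Result = Dict[str, str]

# Each attempt drops fields that are likely to contain typos.
FIELD_SETS: Tuple[Tuple[str, ...], ...] = (
    ("artist", "title", "label", "catno"),
    ("title", "label", "catno"),
    ("catno",),
)


def relaxed_search(record: Dict[str, str],
                   search: Callable[[Dict[str, str]], List[Result]],
                   field_sets: Sequence[Tuple[str, ...]] = FIELD_SETS
                   ) -> Tuple[List[Result], Tuple[str, ...]]:
    """Try each field set in turn; return the first non-empty results
    together with the fields that produced them."""
    for fields in field_sets:
        # Only send fields the record actually has a value for.
        query = {name: record[name] for name in fields if record.get(name)}
        if not query:
            continue                     # e.g. catno-only search with no catno
        results = search(query)
        if results:
            return results, fields
    return [], ()
```

Returning the field set alongside the results makes it easy to flag anything matched on catalog number alone for manual review.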

Unfortunately I do not have a catalog number for every record in my collection, and as of now about 1/3 of the records in my database are still unmatched. For those that are matched, on the record information page you will now see a link to the discogs.com page for that record, as well as a thumbnail pulled from discogs.com if available. For anyone interested in collecting records I highly recommend discogs.com as it is by far the most comprehensive database of music releases I know of. 

Labels: personal, music

Democracy for Realists

Wednesday 02 November 2016

This election in the US has got me thinking a lot about democracy and how it works, or in this case, doesn't seem to work too well. I get the impression that people don't choose their candidates based on the candidate's policy positions matching their own, but the opposite - they choose their policies based on which candidate or political party they support. Well I just read this book, Democracy for Realists, by C. Achen and L. Bartels, which confirms my fears and goes far beyond that to totally demolish what they call the "folk theory of democracy" using statistics and facts.

What they refer to as the "folk theory of democracy" is basically what you are taught in school - that democracies are responsive to the will of the people and allow people to shape the policies and laws of the government; that the people decide what the government will do. By analyzing election results and other statistics, they take a number of theories about how democracies allow the people to express their will and test them, and find them all woefully lacking. It turns out that only one theory holds up, and that is that voters reward or punish their representatives based on the voters' economic prosperity. But the voters are extremely myopic, only taking into account the few months prior to an election when casting their votes and disregarding the rest of the preceding couple of years.

The Founding Fathers of the US set up a representative democracy because they understood that the normal people wouldn't know enough about politics or policy to really make well-informed decisions. So instead of the people voting on the laws the people would elect representatives that they trust to vote on the laws. The representatives would devote their time to studying and debating the issues and would make well-informed decisions. However the Founding Fathers never anticipated the rise of political parties, which today are so firmly entrenched that most people don't even realize they were never part of the plan. 

The folk theory says that people will choose their party based on their political ideology or policy preferences, but in reality it is just as often the other way around - people will develop their policy preferences based on their partisan identity. The authors go beyond this to say that party affiliation is mostly based on a person's "social identity" and has little to nothing to do with their political ideology. The way the book describes it, people choose their party affiliation based on the kind of person they consider themselves to be and the kind of people they think belong to the political party. As far as I can tell this is basically a fancy way of saying "peer pressure" - if your family is Republican and your friends are Republican you are likely to be a Republican even if you disagree with Republican policies. In fact, people will often either change their ideology to match their party's, or convince themselves that their party's ideology is closer to their own than it actually is.

Politics today has become so complex that it is nearly impossible for any normal working person to really understand or make well-informed decisions about all of the policies. In order to be able to handle issues this complex we need to simplify them greatly into mental models which unfortunately omit most of the detail and nuance. Instead of having to consider the myriad sides of an issue and the numerous approaches, we take the talking points that the political parties and the mass media give us and just accept and repeat them. It's a lot easier than having to gather massive amounts of information, sort through it, analyze it and come up with our own opinions. One theory is that political parties provide us with easy cues to figure out what our opinions would be if we had enough time and information for us to come up with them on our own, but this theory is also analyzed and largely debunked.

So if the results of elections have little to do with the policy positions of the candidates and the policy preferences of the voters, then what does drive the elections? Well it turns out it's largely random. Voters will reliably vote out the party in power if the economic wellbeing of the voter has decreased in the months before the election, and vote to keep the party in power if their economic wellbeing has increased just before the election. Voters will also vote out the party in power as a result of things beyond the power of any human to control like floods, droughts, and even shark attacks. But the policy preferences of voters really have little to no effect on elections, other than the fact that many people only develop their policy preferences based on adopting those of the party or candidate they support.

This isn't to say that democracy doesn't work at all; it just doesn't work in the way that it is supposed to work and the way I was taught that it works in school. Because politicians do have to be re-elected, they must avoid the appearance of impropriety and appear as if they have the best interests of the people in mind. This at least prevents the gross abuses that are typical in dictatorships. But as to whether the people really have much say in determining government policy, it would seem that the answer is no.

Personally I think that the party system in the US is a major factor in this. With only two parties dominating the government, they get their voters worked up about silly issues that aren't really all that important and then once they are in power they are largely indistinguishable, except that they keep their members constantly angry with the other party over these wedge issues which will never be addressed. The only people who really have a say in the government are the wealthy donors and corporations who fund the elections and pay the lobbyists. But that is a different book.

I'm sure this book will upset a lot of people because it challenges some basic assumptions people have about America and about democracy in America. People tend to accept facts that confirm the opinions they already have, and get upset when facts contradict their existing opinions. This book really makes you think about democracy, how it works and how it doesn't. I think this is a book that everyone needs to read.

Labels: personal, politics
