We have had solar panels on our house for almost three years. Our DC to AC inverter (which converts solar electrons to wall plug electrons) is connected to the "Internet" (it's new, you should check it out) and reports energy production information to a "website." Specifically, this one: SolarEdge Monitoring (there are demo links that anyone can look at without an account). The website collects and presents the data in a few ways, showing current and past production with graphs, and various totals that make a customer feel good about themselves:
Customers can download historical data off the website, which I have done and analyzed. The plot below shows how many watt-hours (Wh) the system has produced in each of the last three years (check it out, the plot is interactive!). The system was turned on Feb 10, 2014, so it's no surprise that 2014 has the lowest total. Clearly, 2016 was a much better year than 2015, producing over 8% more energy.
(Mobile users may want to request the desktop version of this page to view the plots: Safari, Chrome)
As an aside, here is a histogram of the daily totals over the life of the system. The peak at 0 Wh is real - those are days when the panels are completely covered in snow. Such is solar panel life in Colorado!
The high total for 2016 got me thinking - just how good of a year was 2016? There are many ways to explore this, and I went with one that I'm familiar with: simulation. Essentially, what I did was for every day between Jan 1, 2016 and Dec 31, 2016, I randomly chose the Wh generated on that same calendar day from all my historical samples, and added them up. For example, if I am looking at July 9, 2016, I have three data points (2014, 2015, and 2016), and I choose one of them with equal chance, and add it to the total. I do this for every day of the year, and I do many thousands of simulated years. Out of this simulation I get a distribution of yearly totals:
The mean result is just a shade over 4,000 kWh. I've shown the one-, two-, and three-sigma regions in decreasing shades of grey. The 2016 total, 4,131 kWh, is well past one sigma and close to the two-sigma value. This indicates that 2016 was indeed a fairly good solar year!
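For the curious, the resampling described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from my notebook: the function names, the `daily_wh` dictionary (date mapped to Wh), and the use of the standard deviation to define "sigma" are all assumptions made for the example.

```python
import random
import statistics
from collections import defaultdict
from datetime import date, timedelta


def simulate_year_totals(daily_wh, year=2016, n_sims=10000, seed=None):
    """Resample one simulated year at a time, calendar day by calendar day.

    daily_wh: dict mapping datetime.date -> Wh produced on that date
              (hypothetical input format, assumed for this sketch).
    Returns a list of n_sims simulated yearly totals in Wh.
    """
    rng = random.Random(seed)

    # Pool the historical samples by calendar day, e.g. (7, 9) for July 9.
    samples = defaultdict(list)
    for d, wh in daily_wh.items():
        samples[(d.month, d.day)].append(wh)

    # Every calendar day of the target year that has at least one sample.
    # Days with no history at all are skipped (an assumption of this sketch).
    days = []
    d = date(year, 1, 1)
    while d.year == year:
        if (d.month, d.day) in samples:
            days.append((d.month, d.day))
        d += timedelta(days=1)

    # Each simulated year picks one historical value per day, with equal chance.
    return [sum(rng.choice(samples[key]) for key in days) for _ in range(n_sims)]


def sigma_position(totals, observed):
    """How many standard deviations `observed` sits above the simulated mean."""
    return (observed - statistics.mean(totals)) / statistics.stdev(totals)
```

Calling `sigma_position(totals, observed_total)` with a real year's total (in the same units as the simulated totals) gives a single number to compare against the one-, two-, and three-sigma bands.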
How realistic is this simulation method? 2014 is missing all of January and part of February. In 2015 and 2016, January averaged about 220 kWh. If we take the mean value from simulations of 4,000 kWh, and subtract off 220 kWh, we get 3,780 kWh, which is almost exactly what was recorded for 2014. This is a good piece of evidence that the method might have some validity.
The simulations suggest that 2014 was a normal solar year, and that 2015 (at 3,817 kWh) was a particularly poor solar year. In fact, according to the distribution of simulations, 2015 was a much worse solar year than 2016 was a good one.
I've put all the code here for your inspection. I didn't present it here, but in the code I explored a modification to the simulation method that added some "stickiness" to the choice of historical weather performance. My thought was that because day-to-day weather is highly correlated (tomorrow's weather is most likely similar to today's), favoring runs of days that actually were sequential might be more realistic. TL;DR: I get almost identical results at the expense of much slower simulations.
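To give a flavor of what "stickiness" could look like, here is one possible sketch. It is not the implementation in my code: the `stickiness` probability, the `wh_by_year` layout, and the rule "stay with yesterday's source year with probability `stickiness`" are all assumptions chosen for illustration.

```python
import random
from datetime import date, timedelta


def sticky_year_total(wh_by_year, year=2016, stickiness=0.7, rng=None):
    """Simulate one yearly total, preferring to keep drawing from the same
    historical year on consecutive days.

    wh_by_year: dict mapping source year -> dict of (month, day) -> Wh
                (hypothetical layout, assumed for this sketch).
    stickiness: probability of reusing yesterday's source year when it has
                data for today (hypothetical parameter).
    """
    rng = rng or random.Random()
    total = 0.0
    prev_source = None
    d = date(year, 1, 1)
    while d.year == year:
        key = (d.month, d.day)
        # Source years that actually have a sample for this calendar day.
        candidates = [y for y, days in wh_by_year.items() if key in days]
        if candidates:
            if prev_source in candidates and rng.random() < stickiness:
                source = prev_source           # keep the streak going
            else:
                source = rng.choice(candidates)  # plain equal-chance draw
            total += wh_by_year[source][key]
            prev_source = source
        d += timedelta(days=1)
    return total
```

With `stickiness=0` this reduces to the plain equal-chance draw; with `stickiness=1` a simulated year simply replays one historical year for as long as that year has data.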
Some time ago I saw this blog post where Randal S. Olson used a genetic algorithm to compute an optimized path to visit various landmarks across the United States. I have used genetic algorithms professionally for a few things, and I found this application very fun and clever. As part of the blog post, Mr. Olson linked to the Jupyter notebook he used to calculate the optimized road trip. I am a heavy user of Python and the Jupyter notebook, so this intrigued me.
I got the idea to try to apply this kind of thinking to the challenge of visiting the various state fairs that typically happen in the mid- to late summer across the US. In fact, I got the idea to do this before the summer started. Indeed, some of the fairs analyzed below have already ended. But I have a good excuse! My second child was born very recently, which has decreased my free time and increased the rate of brain cell death due to lack of sleep. These methods can be applied to future years by simply modifying the start/end dates where all the fairs are listed (in the notebook). Below are the results of my investigations.
The code behind these results can be found here. Please follow this link for a browser-friendly rendering of the Jupyter notebook.
We only attend fairs in whole-day chunks. In reality, it's possible to arrive in town and attend the fair on the same day. But to keep this analysis simpler, we'll assume that travel days and fair attendance days are separate.
We drive 12 hours a day. This is obviously something that comes down to personal preference, but I feel that 12 hours is fairly doable if you have more than one driver. Besides, who wants to attend all these fairs by themselves? What this rule means is that if it takes 12 hours and one second to get from point A to point B, the algorithm will count it as a two-day drive. This is obviously not realistic, but it keeps things simple.
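The two rules above boil down to a ceiling division plus a one-day offset. Here is a small sketch of that arithmetic; the function names and the choice of seconds as the input unit are my own assumptions for the example, not necessarily what the notebook does.

```python
from datetime import date, timedelta


def travel_days(drive_seconds, hours_per_day=12):
    """Whole driving days needed for a trip, rounding any remainder up."""
    seconds_per_day = hours_per_day * 3600
    # Integer ceiling division, so exactly 12 hours is one day
    # and 12 hours plus one second is two days.
    return -(-int(drive_seconds) // seconds_per_day)


def earliest_attendance(last_day_at_previous_fair, drive_seconds):
    """First date we could attend the next fair, given that travel days
    and attendance days never overlap."""
    return last_day_at_previous_fair + timedelta(days=travel_days(drive_seconds) + 1)
```

For example, `travel_days(12 * 3600)` is 1 while `travel_days(12 * 3600 + 1)` is 2, which is exactly the "12 hours and one second" case described above.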
When Do Fairs Happen?
Below is a Gantt chart that shows when the various fairs happen. The state name is on the y-axis, and time is on the x-axis. Except for Florida and Nevada, all the fairs happen in a fairly congested time frame between July and November. Please click here to see an interactive version of this chart.
Quick note: As can be seen above, the Florida and Nevada state fairs take (took) place far earlier in the year than all the other fairs. Attending them doesn't conflict with any other fair, so I leave them out of my analyses until the end, or I just don't include them.
How to Spend the Most Days at a State Fair
I'm first going to answer the question of how to spend the most days at a state fair in 2016. This is the way to eat the most fried things, ride the most rides, and see the most hog races in one summer. The winning strategy here is to get to a state fair and not leave it until you have to (e.g. it ends, or it's advantageous to leave for another fair), and also spend the least amount of time on the road between fairs.
For this analysis, we will not use a genetic algorithm, but rather a directed graph (digraph). This kind of graph is a network of nodes and edges, where the edges link nodes in an allowed direction of travel. In this case, the nodes will be fair attendance days, and the edges will be transitions. Some of the transitions will simply return to the same fair for consecutive attendance days, and other transitions will be travel to a new fair that covers one or more actual days.
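As a rough illustration of the digraph idea (and not the notebook's actual code), here is a pure-Python sketch that treats each (fair, day) pair as a node and finds the path with the most attendance days. Because every edge points forward in time, processing nodes in date order is enough. The `fairs` layout and the `travel_days(a, b)` callback are assumptions made for this example.

```python
from datetime import date, timedelta


def max_fair_days(fairs, travel_days):
    """Largest number of attendance days reachable in one season.

    fairs: dict mapping fair name -> (first_day, last_day), inclusive dates
           (hypothetical layout).
    travel_days: function (from_fair, to_fair) -> whole days of driving
                 (hypothetical callback).
    """
    # Nodes are (day, fair) attendance days, sorted chronologically.
    nodes = []
    for name, (start, end) in fairs.items():
        d = start
        while d <= end:
            nodes.append((d, name))
            d += timedelta(days=1)
    nodes.sort()

    best = {}  # (fair, day) -> most attendance days on any path ending here
    for day, name in nodes:
        score = best.setdefault((name, day), 1)
        for other, (start, end) in fairs.items():
            # Staying put costs no travel; moving costs whole travel days.
            gap = 0 if other == name else travel_days(name, other)
            # Earliest attendance day at `other`, waiting for it to open if needed.
            nxt = max(day + timedelta(days=gap + 1), start)
            if nxt <= end:
                key = (other, nxt)
                best[key] = max(best.get(key, 1), score + 1)
    return max(best.values(), default=0)
```

Only the earliest possible arrival at each fair needs an edge, since later days at that fair are reachable through the "stay another day" edges.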
After a bunch of math and stuff (see the notebook!) we end up with 128 days that we can attend a fair. The diagram below describes the strategy of how to do this. Here's how to follow the diagram:
Because they are temporally isolated, Florida and Nevada are to be attended first. We attend each fair for its full duration.
Next we go to California. We see that there is an edge that goes back to California, and one that goes to Ohio. What this means is that we should stay in California as long as we can or want to, but we should travel directly to Ohio in time for that fair to be open.
Likewise for Ohio and Indiana, we should stay at those fairs as long as we can/want to, and then travel to the next fair while it is still running.
For Kentucky, after we have spent enough days there, we have an option of going to Minnesota or Nebraska. Whichever way we go here determines what fairs we can attend later on. For example, if we go to Nebraska, we can't go to New Mexico.
Follow the rest of the tree until we end up in Louisiana.
Most Days on the Road
What if you really like driving, but only if you have a destination? Are you some kind of weirdo? Let's see how we can attend state fairs all summer, but spend most of our time driving.
Doing some analysis similar to the above, we come up with a method that puts us at state fairs for only 43 days. Note that I'm not including Florida or Nevada in this count. This method has added 69 extra days of driving!
The strategy diagram below shows how to maximize driving time. It is quite a bit more complicated than the one above. It shows (as might be expected) that the optimal strategy here is to travel between distant fairs as much as possible, even going back and forth between fairs that are open at the same time. If you can travel to another fair far away, do it! For example, near the top there are links between Delaware and North Dakota going both ways. This means that to maximize traveling time, we should go to the Delaware fair, then North Dakota, and then return to Delaware again. After that we may travel back to North Dakota (if it's still open), or head over to Montana or Maine. Inspecting the diagram we'll notice some fair pairs that are very distantly separated (OR<->AK, WA<->TN), which again, makes sense if we are trying to maximize driving time. Also notice that out of the many possible paths, the WY->RI->AK segment is always the best strategy, which is hardly surprising, considering how long of a trip just those two segments are.
Whether or not this kind of strategy is a good idea is a very good question. But here it is. I dare anyone to try it!
Most Unique Fairs in One Summer (with the Lowest Driving Time)
This is the question of how to attend the largest number of unique fairs in a single summer. What if we want to sample the character of as many fairs as possible? Which state fair has the best fried food? How might we go about that?
The work by Mr. Olson that inspired me to play around with state fairs used a genetic algorithm to determine a fairly efficient way to visit landmarks across the US. After much thought, I decided that a genetic algorithm would not work (easily) for this question. The reason is that the algorithm does various mutations, substitutions, and combinations that would be difficult to implement when time-ordered events limit choices.
For example, let's say we have a short segment in our path:
and the genetic algorithm randomly decides to modify the Delaware step to a new fair. It can't just pick any fair because the new fair has to be open around the time of the California fair. The new fair also has to be located such that we can travel to Ohio afterwards. Chances are that most new fair modifications wouldn't work at all.
Maybe the algorithm could replace the Delaware fair with a fair that works for California but not Ohio, and then randomly pick the next fair (or fairs) to replace Ohio (and the ones after that), fixing the chain until there's no more conflict. To me, that is not much different from just brute-forcing the problem, because it would effectively destroy most of the individual's genome, and it would likely reduce its fitness.
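To make the constraint concrete, here is a sketch of the check a mutated route would have to pass. It is only an illustration: the function name, the `fairs` layout, and the `travel_days(a, b)` callback are assumptions, and it only asks for one attendance day per fair.

```python
from datetime import date, timedelta


def chain_is_feasible(chain, fairs, travel_days):
    """Can the fairs in `chain` be attended in this order, one day each?

    chain: list of fair names in visiting order.
    fairs: dict mapping fair name -> (first_day, last_day), inclusive dates
           (hypothetical layout).
    travel_days: function (from_fair, to_fair) -> whole days of driving
                 (hypothetical callback).
    """
    day = None
    prev = None
    for name in chain:
        start, end = fairs[name]
        if prev is None:
            day = start  # start the trip on the first fair's opening day
        else:
            # Arrive as early as possible; wait for opening day if we are early.
            day = max(day + timedelta(days=travel_days(prev, name) + 1), start)
        if day > end:
            return False  # the fair closed before we could get there
        prev = name
    return True
```

Arriving as early as possible at every step is enough for a yes/no answer, because getting somewhere sooner never rules out a later fair. A random substitution in the middle of a chain has to keep this function returning `True` for everything downstream, which is why so few mutations survive.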
This problem is probably as hard as the Traveling Salesman Problem. There are some differences between our problem here and the TSP, but my gut tells me that the two problems are similarly difficult:
We do not need to visit all the possible states. For example, it's reasonable to expect that visiting Alaska will be very unlikely in any route that optimizes the number of unique fairs visited due to its prohibitive travel time.
It's okay to visit a state fair more than once, most likely on consecutive days, but there's no rule that prevents us from coming back later if there are no other fairs we haven't already visited. This is a good thing: it means more fried food!
Fairs are strictly ordered, which limits our available choices at any given time. It also requires us to visit a fair at a certain time which might limit some other options we might have pursued further down the line.
As daunting as this all sounds, a partially random brute-force method is easy to implement. Running it for five minutes gives us the recommended schedule below (in "State + Date" format), which results in visiting 38 state fairs (or 40 if we throw in Florida and Nevada). There's no guarantee that 40 is the most fairs we can travel to in one calendar year; only a comprehensive search can prove that (which is a very, very expensive problem to solve). However, considering the size of our nation, and that there are two fairs that are impractical to attend (Alaska and Hawaii) and one that doesn't exist (Pennsylvania), 40/47 isn't bad at all.
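For readers who want the gist without opening the notebook, a partially random brute-force search can be sketched as below. This is an assumption-laden illustration rather than my actual code: it uses a fixed number of tries instead of a five-minute timer, picks the next fair uniformly at random among the ones still reachable, and relies on a hypothetical `travel_days(a, b)` callback and `fairs` layout.

```python
import random
from datetime import date, timedelta


def random_route(fairs, travel_days, rng):
    """Build one route by hopping to random, still-reachable, unvisited fairs."""
    name = rng.choice(sorted(fairs))
    day = fairs[name][0]  # attend the first fair on its opening day
    route = [(name, day)]
    visited = {name}
    while True:
        options = []
        for other, (start, end) in fairs.items():
            if other in visited:
                continue
            # Earliest day we could attend `other`, waiting if it isn't open yet.
            arrive = max(day + timedelta(days=travel_days(name, other) + 1), start)
            if arrive <= end:
                options.append((other, arrive))
        if not options:
            return route  # nowhere new to go: this route is finished
        name, day = rng.choice(sorted(options))
        route.append((name, day))
        visited.add(name)


def best_random_route(fairs, travel_days, n_tries=2000, seed=None):
    """Keep the random route that visits the most unique fairs."""
    rng = random.Random(seed)
    best = []
    for _ in range(n_tries):
        route = random_route(fairs, travel_days, rng)
        if len(route) > len(best):
            best = route
    return best
```

Nothing here proves optimality; it just keeps the best schedule it happens to stumble on, which is the trade-off described above.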