Polars scan_csv and sink_parquet
The documentation for polars is not the best,
and figuring out how to do this below took me over an hour.
Here's how to read a headerless CSV file into a LazyFrame using scan_csv and write it to a Parquet file using sink_parquet. The key is to use with_column_names and schema_overrides. Despite what the documentation says, using schema doesn't work as you might imagine, and sink_parquet fails with a cryptic error about the dataframe missing column a.
This is just a simplified version of what I actually am trying to do, but that's the best way to drill down to the issue. Maybe the search engines will find this and save someone else an hour of frustration.
import numpy as np
import polars as pl

# Build a small example frame and write it out as a headerless CSV.
df = pl.DataFrame(
    {"a": [str(i) for i in np.arange(10)], "b": np.random.random(10)},
)
df.write_csv("/tmp/stuff.csv", include_header=False)

# Name the columns with with_column_names and set dtypes with schema_overrides.
lf = pl.scan_csv(
    "/tmp/stuff.csv",
    has_header=False,
    schema_overrides={
        "a": pl.String,
        "b": pl.Float64,
    },
    with_column_names=lambda _: ["a", "b"],
)
lf.sink_parquet("/tmp/stuff.parquet")
John Michael Montgomery - John Michael Montgomery
When I saw the #6 album for this week, I said "I've never heard of John Michael Montgomery," but that's not true. Last year I reviewed All-4-One which included a cover of I Swear that was written and originally performed by John Michael Montgomery. I mentioned this in my review, but 10 months later his name did not remain embedded in my memory.
I Swear isn't on this eponymous album, but I Can Love You Like That is, which was also covered by All-4-One to great success. Playing this album, I Can Love You Like That seemed familiar, but I'm not sure whether it was the Country or R&B version I had heard before. All-4-One has many thanks to give John Michael Montgomery!
Outside of the All-4-One/John Michael Montgomery connection, I find nothing interesting about this album. It's not bad, but it's nothing special. The first line of the first song includes the words "I drive a pickup truck." This album was not trying to break new ground. If you're into this kind of Country music, I'm sure it delivers. I'm not, so it doesn't for me, and I can't recommend this album.
Ol' Dirty Bastard - Return to the 36 Chambers: The Dirty Version
There were no new albums in the top ten last week, but this week another new hip-hop album debuts in the top ten, this time at #7. Return to the 36 Chambers: The Dirty Version by Ol' Dirty Bastard was his debut solo record. He was a member of Wu-Tang Clan from its founding until his death in 2004.
This review is basically a repeat of my last review. Like 2Pac, I was never into Ol' Dirty Bastard (nor Wu-Tang), and listening to this album hasn't changed my mind thirty years later. I acknowledge that Wu-Tang and Ol' Dirty Bastard are important to the history of hip-hop, and that alone means they're worth listening to at least once, but I'm just not into them.
Claude Code
A few weeks ago I started playing with Claude Code, which is an AI tool that can help build software projects for/with you. Like ChatGPT, you interact with the AI conversationally, using whole sentences. You can tell it what programming language and which software packages to use, and what you want the new program to do.
I started by asking it to build a mortgage calculator using Python and Dash. Python is one of the most popular programming languages and Dash is an open source Python package that combines Flask, a tool to build websites using Python, and Plotly, a tool that builds high-quality interactive web plots. The killer feature of Dash is that it handles all the nasty and tedious parts of an interactive webpage (meaning JavaScript, eeeek!). It deals with webpage button clicks and form submissions for you, and you, the coder, can write things in lovely Python.
With a fair amount of back and forth Claude Code built this advanced mortgage calculator, which is only sorta kinda functional. It does a fair amount of what I asked it to do, but it also doesn't do a fair amount of what I asked it to do, and it has a decent number of bugs. The good thing it did was handle some of the tedious boilerplate stuff like creating the necessary directory hierarchy and files, including a README.md detailing how to run the software. Creating a Dash webpage requires writing Python function(s) and a template detailing how to insert the output of the function(s) into a webpage, and Claude Code handled that with aplomb.
What it didn't handle well was more complicated things, like asking it to write an optimizer for funding/paying off a mortgage considering various funding sources and economic factors. It also didn't write the code using standard Python practices. The first time I looked at the code using VS Code and Ruff, Ruff reported over 1,000 style violations. If I found a bug in the code, some of the time I could tell Claude to fix it, and it would, but other times, it would simply fail. Altogether there are about 6,000 lines of code, and as the project got bigger it was clear that Claude was struggling. Simply put, there's a limit to the size or complexity of a codebase that Claude can handle. Humans are not going to be replaced, yet.
At this point I'm not sure what I'll do with the financial calculator. I don't think Claude can help me any more, so I'd have to manually dive in to fix bugs and improve it, and I haven't decided if I will. In summary, my impression of Claude is that it's decent at creating the base of a project or application, but then anything truly creative and complicated is beyond what it can do.
Claude Code isn't free (they give you $5 to start) and I had to deposit money to make this tool. I still have a bit of money to spend, so let's try a few tasks and see how Claude does. I've uploaded all of the code generated below to this repository.
Simple Calculator
I asked it to create a simple web-based calculator, and at a cost of $0.17, here is what it created that works well (that's not a picture below, try clicking on the buttons!).
Simple C Hello World!
Create a template for a program in C. Include a makefile and a main.c file with a main function that prints a simple "hello world!". $0.08 later it performed this simple task flawlessly.
Excel Telescope Angular Resolution
Create an Excel file that can calculate the angular resolution of a reflector telescope as a function of A) the diameter of the primary mirror and B) the altitude of the telescope. It went off and thought for a bit, and spent another $0.17 of my money. It output a file with a ".xlsx" extension, but the file can't be opened. Looking at the output of Claude, I suspect that this may be a file formatting issue because Claude is designed to handle text file output rather than binary.
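For reference, the diameter half of that request is a one-line formula. Here is a rough sketch in Python using the Rayleigh criterion, theta = 1.22 * wavelength / diameter. The function name and the 550 nm (green light) default are my own assumptions, and the altitude dependence, which comes from atmospheric seeing rather than the optics, is left out entirely.

```python
import math

# Conversion factor from radians to arcseconds (about 206,265).
ARCSEC_PER_RADIAN = 180 * 3600 / math.pi


def diffraction_limit_arcsec(diameter_m: float, wavelength_m: float = 550e-9) -> float:
    """Rayleigh criterion, theta = 1.22 * wavelength / diameter, in arcseconds."""
    if diameter_m <= 0:
        raise ValueError("diameter must be positive")
    return 1.22 * wavelength_m / diameter_m * ARCSEC_PER_RADIAN


# A few example primary mirror diameters in meters (assumed values).
for diameter in (0.1, 0.2, 1.0):
    print(f"{diameter} m mirror: {diffraction_limit_arcsec(diameter):.3f} arcsec")
```

Doubling the mirror diameter halves the smallest resolvable angle, which is the relationship the spreadsheet would have tabulated.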
Text to Speech
Seeing that Claude struggled with creating binary output, next I asked it to create a Jupyter notebook (Jupyter notebooks have the extension .ipynb but they're actually text files) that uses Python and any required packages and can take a block of text and use text to speech to output the text to a sound file. This succeeded ($0.12), and in particular used gTTS (Google Text To Speech) to do the heavy lifting.
Rust Parallel Pi Calculator
Write a program in Rust that uses Monte Carlo techniques to calculate pi. Use multithreading. The input to the program should be the desired number of digits of pi, and the output is pi to that many digits. $0.11 later, I got a Rust program that crashes with an "attempt to multiply with overflow" error. Not great! I could interact with Claude and ask it to try to fix the error, but I haven't.
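To make the technique in that prompt concrete, here is a minimal sketch of a threaded Monte Carlo pi estimate in Python. It is not the Rust code Claude generated. The function names, the fixed seeds, and the sample count are my own choices for illustration, and the threads are only there to mirror the prompt: in standard CPython they will not make this loop faster.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(samples: int, seed: int) -> int:
    """Count random points in the unit square that land inside the quarter circle."""
    rng = random.Random(seed)  # each worker gets its own seeded generator
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(total_samples: int, workers: int = 4) -> float:
    """Split the sampling across worker threads and combine the hit counts."""
    per_worker = total_samples // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = sum(pool.map(count_hits, [per_worker] * workers, range(workers)))
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * hits / (per_worker * workers)


estimate = estimate_pi(200_000)
print(f"estimate={estimate:.4f}, error={abs(estimate - math.pi):.4f}")
```

The error of a Monte Carlo estimate shrinks roughly as one over the square root of the number of samples, so each extra correct digit costs about 100 times more samples. Asking for an arbitrary number of digits this way makes the sample count blow up quickly, which is one plausible route to an integer overflow in a language with fixed-width integers.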
Baseball Scoresheet Web App
Create a web app for keeping score of a baseball game. Make the page resemble a baseball scoresheet. Use a SQLite database file to store the data. Make the page responsive such that each scoring action is saved immediately to the database. At a cost of $0.93, it output almost 2,000 lines of code. The resulting npm web app doesn't work. Upon initial page load it asks the user to enter player names, numbers, and positions, but no matter what you do, you cannot get past that. Bummer!
Random Number Generator
It looks like the more complicated the ask gets, the worse Claude gets. Here's one more thing I'll try that I have no idea how to do myself. We'll see how well Claude does at it. Create a Mac OS program that generates a truly random floating point number. It should not use a pseudo-random number generator. It should capture random input from the user as the source of randomness. It should give the user the option of typing random keyboard keys, or mouse movements, or making noises captured by the microphone. Please use the swift programming language. Create a Xcode compatible development stack. Create a stylish GUI that looks like a first-class Mac OS program. $0.81 later, and after asking it to fix one bug, I had a second attempt that was also broken, with 10 bugs. Clearly, I've pushed Claude past its breaking point.
I'm guessing that all of the broken code can be fixed, maybe by Claude itself, but it ultimately might require human intervention in some cases. I'm optimistic that it could fix the Rust/pi bug, but I'm not optimistic that it can fix the baseball or random number generator stuff. AI code generation might be coming for us eventually, but not today.
2Pac - Me Against the World
There were no new albums in the top ten last week, but this week a new entrant rockets to #1. I was never that into 2Pac, and listening to this album this week hasn't changed my mind. I acknowledge that he was one of the best at his craft, and I like some of his contemporaries in hip-hop, but for whatever reason 2Pac just doesn't do it for me.
Considering his historical importance it's probably worth your time familiarizing yourself with 2Pac, but I personally don't endorse his music.
Kid Arts
Here's a bunch of various kid art for March 2025. It appears that a few browsers on MacOS, including Chrome, Opera, and Vivaldi, have issues with the black and white avif images below. Firefox and Safari appear to work just fine. Enjoy!
Bruce Springsteen - Greatest Hits
Last week there was no new album in the top ten, but this week a new arrival shot straight to the top. Released in 1995, this was the first "Greatest Hits" album by Bruce Springsteen. In the thirty years since quite a few more have been released.
Springsteen certainly wasn't at peak popularity in 1995; his most recent albums, released in 1992 (Human Touch and Lucky Town), didn't match the success of his earlier work. His contribution of the song Streets of Philadelphia to the 1993 film Philadelphia definitely helped keep him relevant. A decade removed from his best and most popular album was probably a good time to revisit his twenty years of music. Many critics dislike compilation/best of albums, but they are obviously very popular with consumers.
Springsteen is currently my #1 most listened to artist on last.fm. I am obviously biased in favor of his music. My favorite Springsteen compilation album is probably The Promise, but this first Greatest Hits album is a good collection of the best of his first two decades of songs.
Thinking back I wasn't into Springsteen in 1995 as much as I am now. When I was a teenager his kind of rock music wasn't in fashion and it took me until the wisdom of adulthood to discover it.
I don't necessarily recommend this exact compilation of Springsteen, but I do recommend listening to his library. A more recent Greatest Hits from 2024 is another good choice, especially in the 31 song digital version. You should check it out!
Live - Throwing Copper
According to the chart, Throwing Copper by Live had been out for almost a year (43 weeks) before it cracked the top ten. The good news for Live was that their trajectory continued upwards all the way to the top. According to Wikipedia the album took 52 weeks to reach #1 in May 1995.
I seem to remember some rumblings when this album was popular that Live was secretly a Christian rock band that was crossing over to the mainstream, and that some of the songs like Lightning Crashes were anti-abortion songs. Reading the Wikipedia page for Lightning Crashes, it seems that some of that confusion might have come from the music video. In truth the song was dedicated to a high school friend of the lead singer who was killed by a drunk driver.
My opinion of Live has perhaps been tainted by these rumblings and I never got deeply into their music. My listening count for Live is small and a few of their contemporaries have quite a bit more plays from me, like Collective Soul, Bush, and Counting Crows. Listening now I think the music is just a bit too corny and just a bit over the top for me.
My recommendation is that Throwing Copper is worth a stream, but it's not spectacular. It is very 90s, and it is fun to re-discover some of Live's music from thirty years ago.
RadioX Streamer
Almost a year ago I linked to Finale which is an iOS app that listens to the music you're playing and auto-scrobbles to last.fm using song recognition methods similar to Shazam or SoundHound. It works decently well, but it requires two devices: your iOS device (which can't be playing music), and something playing music over speakers. With these limitations its utility is somewhat stunted.
Recently I discovered RadioX 5, which is a Mac OS internet radio streaming application that includes the ability to scrobble plays to last.fm. It has many other features but this one is key for me. It lives in the menubar so it's very unobtrusive. The one disappointment is that it doesn't automatically scrobble songs; you need to manually hit the red circular last.fm icon in the app to save the play. At first this seems like a huge omission, but without some external way of knowing what is a song and what isn't, the application would end up scrobbling non-songs. For example, below is what the application shows when the DJ is speaking between songs on The Colorado Sound (I don't know why the cover art is for Lynyrd Skynyrd, it wasn't played before or after the DJ interlude). I wouldn't want to scrobble the non-song "The Colorado Sound" by the non-artist "The Colorado Sound."
Short of access to a comprehensive database of global artist+song titles, which if it exists I'll bet wouldn't be cheap for programmatic use, I can think of one way that might allow for auto-scrobbling. If the application had a way to prevent scrobbling for a particular artist+song combination set by the user, probably custom to each radio station, then auto-scrobbling everything else might be practical.
Hootie & the Blowfish - Cracked Rear View
In three months (thirty years ago) Cracked Rear View by Hootie & the Blowfish will hit #1, but this week it is only at #7. Looking at the last.fm page for Hootie & the Blowfish the top comment says "Damn...their one album sold more copies than they have plays on Last.fm" which is equally true and revealing. As of this writing, they have 7.6 million scrobbles on last.fm, while Cracked Rear View has sold over 20 million copies. Working the math, this means that if all of Hootie's scrobbles came from playing a copy of Cracked Rear View just once, only 3% of those albums sold have been played while last.fm has existed. Clearly, Hootie's popularity is not what it once was.
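The back-of-the-envelope math can be spelled out in a few lines of Python. The scrobble and sales figures are the ones quoted above. The number of scrobbles produced by one full play of the album is not stated in the post, so the 11 tracks used here is an assumption, as are the function and variable names.

```python
def share_of_copies_played(scrobbles: int, copies_sold: int, tracks_per_album: int) -> float:
    """Fraction of sold copies that would have been played once, start to finish."""
    full_album_plays = scrobbles / tracks_per_album
    return full_album_plays / copies_sold


# Figures quoted in the post; the track count is an assumed value.
share = share_of_copies_played(
    scrobbles=7_600_000,
    copies_sold=20_000_000,
    tracks_per_album=11,
)
print(f"{share:.0%}")
```

With those inputs the share works out to roughly 3%.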
Of course, this is not how scrobbling works; last.fm didn't exist when the album was released and only a fraction of music listeners bother to scrobble their plays. Also, I don't want to emphasize scrobble counts too much: The Beatles have 842 million scrobbles and Taylor Swift has 2.86 billion. I would hope that even Taylor Swift would admit that The Beatles are far and away more important, consequential, and everlasting (and honestly better) than she is. The only real thing we can read from Hootie's scrobble count is that they have fallen out of the current musical zeitgeist.
Like most last.fm users, my listening history for Hootie is pretty thin. I had only 17 plays prior to listening to this album, which averages to less than one play per year. I remember being aware of Hootie when it came out and hearing the songs on the radio, but they were not what I purposely listened to. Thirty years later, I don't think my opinions have changed all that much. Hootie is fine, but doesn't move me nor does it grab my attention. This album gets a big "meh" from me. I recommend that you listen to it, or don't, whatever, I don't care.
Too $hort - Cocktails
This week we drop down to the #7 album Cocktails by Too $hort. Before listening to this album I had exactly zero Too $hort plays recorded on last.fm. I recall him being somewhat popular thirty-ish years ago, but clearly I haven't thought about him much since then.
Thirty years later, I can't say that his music has grown on me. It's quite misogynistic and explicit about how he treats women. Don't get me wrong, I don't subscribe to moral panics and music. I don't think that just because I don't like this music it should be banned, censored, or otherwise. A lot of music tells stories, which includes boasting, and there are different ways to do that in different musical genres. I just don't dig his way of telling stories, and his stories don't interest me.
My recommendation is to skip listening to this album. Instead, why don't you spin up Wildflowers by Tom Petty? It peaked at #8 in December 1994, and dropped down since then. I didn't review it because other un-reviewed albums were higher during the holidays, which was a shame. Wildflowers is a far, far better album than Cocktails, and it deserves your attention. Links to various streaming services are below 👇🏻.
Van Halen - Balance
Debuting at #1, Balance by Van Halen displaced the un-reviewed Garth Brooks album from the top spot (fret not Garth fans, it returns to #1 next week).
I like much of Van Halen, but this album isn't as good as their earlier work. Albums like Van Halen, 1984, and 5150 all have what made Van Halen great, which is what this album lacks. Van Halen was about big, energetic songs, and Balance doesn't have those. Eddie's guitar playing is, of course, excellent, but the songs don't grab you like, say, Runnin' with the Devil does.
My recommendation is to listen to the album, but don't expect what you would get out of their earlier albums.
Boulder NIMBYs
On the Not In My Back Yard to Yes In My Back Yard spectrum, I am definitely on the YIMBY end. I firmly believe that one of the most corrosive elements of modern American society is how expensive housing has become. It has vastly outpaced inflation and has locked a huge swath of Americans out of the ability to afford to rent or buy a home where they want to live. It makes moving for whatever reason (job, schooling, family, etc) much harder. I believe that it is one of the biggest contributors to political dysfunction and the rise of extreme political division.
I live in Boulder, Colorado, which has a long history of open space preservation. In order to prevent increasing sprawl and encroachment on the iconic Boulder flatirons, in 1959 the city passed the "Blue Line" ordinance that prevented city water from being delivered above a set altitude. This made development above that line much more difficult. In addition, the city and county have made large land acquisitions over the years that have basically encircled the city of Boulder in publicly owned open space.
I want to stress that there's a difference between NIMBYism and preservation. I think one of the best things about Boulder is how close outdoor space is. Starting from my front door on a bike I can be on rural country roads or mountain bike trails in five to ten minutes. But preserving open space doesn't mean that already developed space can't change or get more dense. This is where environmentalism can descend into NIMBYism.
The problem is that choosing to preserve open space and preventing increased density means that any new development has to happen somewhere else. Of course, that's the hope of a true NIMBY, but it's selfish, shortsighted, and inconsistent with true environmentalism. Preventing more density in an existing neighborhood means that new development requires converting existing undeveloped open space into developed land. This means more roads, more people living farther away from destinations, more pollution, and less natural open space.
For a long time the Boulder city council was dominated by members of PLAN-Boulder, the group largely responsible for the Blue Line and other environmental efforts in the last few decades. However, in recent elections more development friendly people have been elected, displacing PLAN-Boulder endorsed council members. I think the most obvious explanation is that as more and more people in the community are finding themselves priced out of housing in Boulder, the status quo, which is what PLAN-Boulder represents, is less attractive to voters.
Regrettably, and I often contemplate deleting my account, I am on Nextdoor, which is Twitter for old people. The Boulder Nextdoor is a NIMBY echo chamber. Almost daily there is a post from someone lamenting a new building, or something that's been gone 25 years. Posts complaining about the current council have been frequent lately, and on one of them they linked to a new group: Boulder Action. I was curious, and I clicked on the link. Now, dear reader, you get to travel with me into the mind of a Boulder NIMBY.
On the homepage (see the bottom of this post for a screenshot) they have an info box about Density, Growth and Housing. Let's look at each bullet point:
- Our city just feels too crowded. Not to belabor the point, but one person's too crowded is another's too empty. This is almost meaningless and impossible to measure. It's vibes. More specifically, how do they propose making it "feel" less crowded: depopulation? The only American cities I can think of that depopulated did it for negative reasons such as loss of jobs, or major pollution events. Is that what they want?
- It seems more dense, over-built and less pedestrian-friendly. Vibes! Whoever wrote this copy needed an editor. Two of these topics can be measured (there's no "seems" at all): density and pedestrian-friendliness. Measuring density is obvious, but for pedestrians, I'd think that this can be measured by the ability to access destinations by foot and the danger of being injured as a pedestrian. I think that more density correlates with pedestrian friendliness, but it would seem that they don't, which is a tell I'll return to later on.
- Small, local businesses are struggling. This may be true, but this needs data compared to local and national trends. Is it any more true now than it was before? Can it be explained entirely by the city council encouraging development? Without data, this is just a vibe.
- Although a lot of housing is being built, much of it is not affordable and not suitable for families. I'd take issue with "a lot of housing," but it probably feels that way to NIMBYs. The cost of the new housing is definitely higher than most would like, but some of the solutions to make building housing cheaper (removing single-family zoning, easier permitting process, no parking minimums, higher density limits, etc) are anathema to NIMBYs. Their preferred policies directly contribute to the high cost of building housing. Furthermore, the single-family neighborhood of detached homes is not the only suitable place to raise a family. The second half reveals the lack of imagination of the Boulder NIMBY; they can't fathom that not everyone wants to live the way they've chosen to live.
- Building height restrictions and some of our light industrial areas are in danger of being eliminated in the service of more housing. Maximum building height is a special topic for the Boulder NIMBY. Being able to see the mountains from anywhere in Boulder is super important to them. I find it fascinating: "we'd like to build more housing so people can afford to live here," and the NIMBY answer is "no, I need to see the mountains from everywhere." I'm not sure about the concern about light industrial areas except as a way to argue against zoning/development changes in general.
- Our neighborhoods are threatened by a city council majority who clearly want to increase occupancy and make changes to our zoning regulations, eliminating single family zoning. Well, yeah, duh. Single family housing is probably the top reason why there's a housing crisis. The problem is NIMBYs act like getting rid of single family housing means iron foundries will be built in residential areas. No one wants that, but would it be so bad to have more (any?) missing middle housing in residential neighborhoods? Will a triplex cause your neighborhood to descend into a slum? A corner coffee shop or salon/barber wouldn't ruin your neighborhood, I guarantee. Notice that they cite the "city council majority." I remember voting for the city council and many of the current members openly campaigned on doing the things these NIMBYs dislike. This is how democracy works, folks.
- Driving through cross-town traffic during the day is daunting. Here's the big tell of these NIMBYs. They expect to be able to drive in town during the day and not encounter traffic. Above they complain about the city seeming less pedestrian-friendly, and here about traffic. Any measure to make the city more car friendly is certain to be less pedestrian friendly. They want to be able to drive without traffic, and want a pleasant walk where they choose to walk, and they don't care about where other people might want to walk if it makes their drive "daunting." The only way to accomplish that is if their walks never go to any destination one might drive to, because cars and pedestrian accommodations are opposites in cities. Making space for cars means taking it from pedestrians, cyclists, and other modes of transportation.
There's more on that page, but I'm going to stop here. I think all of this boils down to NIMBYs being selfish and not accepting that places change. I saw the graphic below some time ago, and I think of it often when hearing NIMBY complaints about change. Things (vibes!) were just better in the past, and anything that is change is bad.
Finally, because the group is so new I suspect that the Boulder Action webpage will experience change in the near future. I've put a screenshot of how the page looked when this post was written here.
The Cranberries - No Need to Argue
Rising to #6 this week, No Need to Argue by The Cranberries was the band's second album after Everybody Else Is Doing It, So Why Can't We? and is their top selling album. It features the international hit song Zombie, which according to the Wikipedia page "wasn't grunge, but the timing was good," but which definitely has grunge elements in my opinion. I think the only other single off the album I remember hearing on the radio was Ode to My Family.
The lead singer Dolores O'Riordan had a highly distinctive voice that's instantly recognizable. They surely would not have achieved success without her.
The Cranberries have long-lasting appeal. My 12 year old daughter loves Zombie. This album is definitely worth checking out.