I review all the books I read, here is a selection of my favorites that I read in 2023:
Blindsight does a great job of exploring the nature of consciousness and intelligence. Watts keeps the tension high and the plot moving quickly in this thought-provoking sci-fi novel. My favorite book of the year!
Blindsight is a hard sci-fi novel about first contact with aliens in the near future. A crew of four transhumans and a vampire are sent on a spaceship to investigate an anomaly in the solar system after a swarm of alien probes scan Earth.
Hamilton is known for his space opera, but The Abyss Beyond Dreams is more urban fantasy set during the Russian Revolution (in space) and Night Without Stars is a thriller set during the Cold War (again, in space). Both feature Commonwealth citizens with special knowledge as “Outside Context Problems”, pulling the stories into science fiction territory.
The Abyss Beyond Dreams starts off The Chronicle of the Fallers, another series in Hamilton’s Commonwealth universe. Though billed as space opera, it often reads more as urban fantasy since most of the story occurs on the planet Bienvenido inside the Void where steam engines are their most advanced technology.
Night Without Stars is the second book in the Chronicle of the Fallers. It is action packed, with great pacing, and complex characters. It is my new favorite Hamilton book.
Of the two books, I enjoyed the sequel to Hyperion more because it tied the personal story of the pilgrims to a much broader galactic conflict. Interestingly, you can see a lot of ideas in the Hyperion Cantos that Hamilton later adopted in his Commonwealth Saga, including wormholes, a breakaway-but-helpful AI, and different factions of scheming AIs who either want to eradicate the humans or uplift them.
The Fall of Hyperion is a sequel that outshines its predecessor. It is everything I was expecting from Hyperion and more! A true masterpiece.
Epic space opera with a massive cast of characters and incredible pacing.
I couldn’t put Pandora’s Star down! It is a sci-fi book that reads more like a thriller. There were always new mysteries that just a few more pages promised the answers to.
The sequel to Pandora’s Star, Judas Unchained continues right where the last one left off, but with the action ramped up to 11. The various storylines and loose threads come together one by one until it’s the good guys racing against the bad guys for the fate of the universe.
1980s mech sci-fi re-imagined for the 21st century. Warren’s self-published series takes a few books to really find its feet, but once it does, it’s a quick, fun, nostalgic read. The third book, Serpent Valley, exemplifies the series.
Serpent Valley, the third book in the War Horses series, is another quick, action-packed read—but without the flaws holding back its predecessors. Easily my favorite of the series so far!
This is straightforward when the officer writes `FORD` or `HONDA`, which they mostly do. But since the officer can write anything, they occasionally make it a little harder on us by abbreviating or mistyping, for example `VOLX` and `DODDGE`. And sometimes they make it impossible by writing `--` or `______`.
The solution is to go through, one by one, and create a mapping like:
```python
from enum import Enum, unique

# Enumeration of common vehicle makes
@unique
class Make(Enum):
    CHEVROLET = "chevrolet"
    GMC = "gmc"
    HINO = "hino"
    INFINITI = "infiniti"
    MITSUBISHI = "mitsubishi"
    # Special token for unknown make
    NONE = None

# Dictionary mapping raw values to Make enum
make_map = {
    "CHEVRLT": Make.CHEVROLET,
    "HINO/": Make.HINO,
    "INFINITY": Make.INFINITI,
    "MITSUB": Make.MITSUBISHI,
    "TAHOE": Make.GMC,
    "UKNOWN": Make.NONE,
}
```
Having done this mapping by hand for over 900 entries, I can tell you it is quite tedious. Fortunately, making sense of mangled text is something Large Language Models (LLMs) are pretty good at!
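Once built, applying such a map is mechanical. Here is a minimal sketch using a trimmed copy of the enum and map above; the helper name `normalize_make` is my own, not from the original code:

```python
from enum import Enum, unique

@unique
class Make(Enum):
    CHEVROLET = "chevrolet"
    GMC = "gmc"
    NONE = None

# Trimmed copy of the hand-built map above
make_map = {
    "CHEVRLT": Make.CHEVROLET,
    "TAHOE": Make.GMC,
}

def normalize_make(raw: str) -> Make:
    """Map a raw free-text make to the Make enum."""
    cleaned = raw.strip().upper()
    if cleaned in Make.__members__:
        # Exact enum-name match, e.g. "GMC"
        return Make[cleaned]
    # Known misspellings, else the unknown placeholder
    return make_map.get(cleaned, Make.NONE)
```

Anything not covered by either lookup falls through to `Make.NONE` rather than inventing a new value.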
The goal is to perform few-shot, multi-class classification of vehicle makes. Few-shot because we are going to give the model just a handful of examples of what output we expect, and multi-class because there are many possible vehicle makes it will have to map to.
The first step is to write a prompt explaining the task to the model, the expected return value, and a few examples of inputs with correct outputs. Here is a shortened version (the full one is here), starting with the instructions:
I am working with a dataset of traffic collisions from California. One of the fields is the “make” of the vehicle, for example, “Honda”, “Ford”, “Peterbilt”, etc.
But this field is a free-text field filled out by the CHP officer at the scene of the collision. As such, there are misspellings, abbreviations, and other mistakes that have to be fixed.
I have created a set of makes as follows (including `NONE` as a placeholder for unknown values). Here is the list as a Python `Enum`:

```python
@unique
class Make(Enum):
    ACADIAN = "acadian"
    ACURA = "acura"
    ALFA_ROMERO = "alfa romera"
    AMC = "american motors"
    ...
```

Take note that anything unknown should be tagged with `Make.NONE`. And do not make up new Enum values.
Then the output format, with instructions to include an explanation of its logic first, which can help model accuracy:
I will provide you with a string. You are to return a Python dictionary with the following keys, in this same order:
```
{
    explanation: "An explanation of why you think the enum value is a good match, or why there is no match possible.",
    input_string: "The input string",
    enum: "The correct enum from above",
    no_match: "`True` or `False`. True if there is no matching enum or no way to make a match, otherwise False.",
}
```
And finally some examples of inputs and correct outputs:
For example, for the input `VOLX`:

```
{
    explanation: """VOLX is pronounced similarly to 'Volks' and therefore this is probably an abbreviation of 'Volkswagen'. There is an enum value for Volkswagen, `Make.VOLKSWAGEN`, already so we use that.""",
    input_string: "VOLX",
    enum: Make.VOLKSWAGEN,
    no_match: False,
}
```
Since I was manually copying the prompt into the model’s web interface, I used batches of 100–200 strings sorted alphabetically. With API access, I could have used retrieval-augmented generation to create custom examples for each string while sending them one at a time.
Splitting the data into batches helped the model figure out very short entries. For example, the model failed when given `WNBG` (Winnebago) by itself, but succeeded when I gave it the list:

```
WINN
WINNE
WINNEBAG
WINNEBAGO
WINNI
WNBG
WNBGO
```

I believe seeing multiple short versions next to each other helped the model infer the right mapping.
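The batching itself is simple to script. A sketch, assuming the raw makes are in a list of strings (the helper name `make_batches` is mine; a `batch_size` of 150 matches the 100–200 range I used):

```python
def make_batches(raw_strings, batch_size=150):
    """Deduplicate, sort alphabetically, and split into prompt-sized batches."""
    ordered = sorted(set(raw_strings))
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]

# Sorting keeps variants like WINN/WINNE/WINNEBAGO adjacent in one batch
batches = make_batches(["WNBG", "WINN", "WINNEBAGO", "WNBGO", "WINNE"])
```

Sorting before splitting is what puts the short variants side by side, which is exactly the context that helped the model.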
I obtained the following performance on my 902 hand-mapped entries:
This is reasonably good performance, as finding wrong entries is pretty quick (and many could be fixed with find and replace).
The “make” of a vehicle is the brand of the manufacturer, like ‘Honda’, ‘Ford’, ‘Tesla’, etc. ↩
We’ve now put 1850 miles on the bike, mostly taking the kids to local parks. After all the time spent with it, would we recommend it to another family?
In short: we love the utility an electric cargo bike offers, and I think we will always have one in the garage to supplement our minivan, but the RadWagon has some significant drawbacks that make recommending it difficult. Read on for the full review.
We ordered the RadWagon from Rad’s website for $1500. With a “caboose” enclosure and pads for the kids adding $250, plus tax, our total was $1901—not cheap, but roughly a third of the price of comparable cargo bikes, which run in the $4000–6000 range.
The bike comes in a box and you must assemble it yourself. The assembly was not too hard, but I also have a lot of experience with bikes and bike maintenance which helped.
The seat post and headset are highly adjustable, allowing people of varying heights to ride the RadWagon. Both my wife (5’4”) and I (6’1”) can ride it comfortably, but we’re both near the limits. I have the seat all the way up and she has it all the way down.
The 750W rear hub motor easily brings the bike to its 20 miles per hour computer-limited top speed. The bike has a throttle, which I love for getting started from a stop, and enough power to carry me, two kids, and some balance bikes up a steep hill. The downside though is the hub motor puts a lot of stress on the rear wheel.
A flaw in the design is the rear brake. The RadWagon uses cheap, mechanical disc brakes, which are enough to stop the bike when they’re well aligned, but which need constant attention to keep them that way. The motor blocks the typical through-spoke access for adjusting the rear brake. Instead, Rad makes a special flat Allen wrench that fits between the brake and motor but adjusting remains hard.
A major flaw is the rear spokes. They are stressed by both the motor—which puts all its power through the rear wheel—and the cargo. The spokes were not tight enough from the factory and I broke several in the first 200 miles. I have broken fewer since replacing and retightening, but I still break one periodically, which is annoying for me and likely a dealbreaker for less experienced riders.
I’ve contacted Rad’s support several times—to order spokes and the brake tool, and to replace a faulty accessory. They were generally quick and helpful, but I’ve never needed support for other bikes. And the last interaction was terrible: Rad’s front basket was defective, and after I sent them photographic proof, they accused me of being unable to use a screwdriver and stopped responding. It was the worst support experience I’ve ever had.
The RadWagon was a savior during the pandemic, letting us escape the house and ride when we’d otherwise be trapped inside. It also beats driving kids in a car—hauling them on the back makes getting to the park part of the fun.
But the bike needs constant maintenance that is difficult even for an experienced mechanic, and Rad’s support is not great. I wish I had purchased a higher-quality bike that wouldn’t fail so frequently.
You can probably already see the problem: APIs are meant to hide the inner workings behind them. But by returning the raw model scores, we revealed too much detail. Any changes to the model, like retraining it, could change the scores and break the front end.
In my guide to deploying machine learning models in shadow mode, I stated that deploying changes “in front of the API” has the advantage of giving the calling team control. This is precisely why we built the ATO API the way we did: to address the organizational issue that the engineering team did not trust the machine learning team.
But if your teams trust each other, there is a much better way to build.
A better way is for the API to return a set of actions. For example, the ATO model API might return one of the following actions: `Allow`, `Step-up`, or `Lock`.
These actions do a really good job of hiding the implementation behind the API. You can freely change thresholds when the model performance changes, retrain the model, or even replace it entirely.
But you can do something else too: you can add more models!
A common fraud-prevention strategy is to train a model for each new fraud pattern identified. This allows each model to be highly precise, while also improving the recall of the overall system. These multi-model systems are often augmented with simple rules, such as “No logins from Russia allowed.” In the end, the system takes the outputs of the various models and rules and aggregates them in some way. In our ATO example, the system returns the most drastic action recommended by any model or rule.
In code:
```python
def ato_api(event_token):
    # List of actions returned by all the models and rules,
    # consists of values from {'Allow', 'Step-up', 'Lock'}
    all_results = get_ato_system_results(event_token)
    if 'Lock' in all_results:
        return 'Lock'
    elif 'Step-up' in all_results:
        return 'Step-up'
    return 'Allow'
```
Of course, this is a great place to use enums and max:
```python
from enum import IntEnum, unique

@unique
class Action(IntEnum):
    ALLOW = 0
    STEPUP = 1
    LOCK = 2

def ato_api(event_token):
    # List of actions returned by all the models and rules,
    # consists of values from the Action() enum
    all_results = get_ato_system_results(event_token)
    return max(all_results)
```
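Because `IntEnum` members compare as integers, `max` selects the most drastic action with no explicit branching. A quick self-contained demonstration:

```python
from enum import IntEnum, unique

@unique
class Action(IntEnum):
    ALLOW = 0
    STEPUP = 1
    LOCK = 2

# max() picks the most drastic action thanks to the integer ordering
results = [Action.ALLOW, Action.STEPUP, Action.ALLOW]
assert max(results) is Action.STEPUP

# A single rule recommending Lock escalates the whole decision
assert max(results + [Action.LOCK]) is Action.LOCK
```

A nice property of this design: adding a new, even more drastic action later only requires a new enum member with a higher value; the aggregation code does not change.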
But in the last 7 years language models have gotten much, MUCH better. I wondered how a state-of-the-art model, one too large to run on my computer, would perform on the same questions.
To find out, I ran the analogies through Anthropic’s biggest model: Claude.
I gave Claude the following instructions:
We’re going to solve SAT analogy questions. I’ll give you a pair of words like:
“authenticity : counterfeit”
And you determine the relationship between the two words, and then pick the pair from the next 5 with the same relation. So in this case I would give you:
reliability : erratic
mobility : energetic
argument : contradictory
reserve : reticent
anticipation : solemn
You need to pick:
“reliability : erratic”
Make sure to explain the relationship and why you chose the answer you did.
Then I gave it an analogy like:
fix : stable
------------------------
fortify : strong
captivate : attractive
deter : active
furrow : productive
mend : torn
And it would reply with its answer and reasoning:
Okay, for this analogy:
agenda : conference
The relationship here is:
An agenda is a list that provides the plan or schedule for a conference. So a conference relies on and uses an agenda.
Based on this, the choice that shows a similar relationship is:
map : trip
Because a trip relies on and uses a map. A map provides the plan and guidance for a trip, just as an agenda does for a conference.
The other choices do not show this same kind of relationship:
- Teacher : class - A teacher facilitates a class but a class does not rely on or use a teacher in the same direct way that a conference uses an agenda or a trip uses a map.
- Executive : employee - An executive oversees employees but employees do not directly rely on or make use of an executive the same way a conference uses an agenda or a trip uses a map.
- Man : woman - These do not have a direct facilitator/user relationship. They do not rely on each other in the same way.
- Agency : assignment - While an agency may give out assignments, an assignment does not directly rely on or use an agency in the way a conference uses an agenda or a trip uses a map.
So by analyzing each choice, map : trip is the one option that shows the closest relationship to the original pair’s dynamic of agenda : conference. Please let me know if this makes sense or if you have another analogy challenge for me!
One potential issue: I left the analogies and their possible answers in the sorted order from my word2vec post. This means the analogies word2vec got right came first, with the correct answer in the first position, then the ones where word2vec had the correct answer second, and so on. I randomized a few of the analogies to check whether this was a significant problem, and it did not seem to be. If I had API access, I would have randomized the order of all the potential answers and run multiple tests.
Unsurprisingly, Claude did much better than Word2Vec. It got 25 out of 35 analogies correct1 compared to 8 out of 36 for Word2Vec.
The most frequent failure appears to be Claude identifying the right relationship, but then incorrectly applying it to choose the answer. For example, this analogy (the correct answer is vindictive : spite; Claude chose despondent : glee):

hospitable : courtesy

- vindictive : spite
- despondent : glee
- leisurely : haste
- morbid : cheerfulness
- infamous : honor
With the following logic:
Okay, for the pair:
hospitable : courtesy
The relationship is:
Someone who is hospitable shows courtesy or kindness to their guests. So there is a relationship where one concept (hospitable) leads to or results in what the other represents (courtesy).
Based on this, the choice that shows a similar relationship where one concept results in what the other represents is:
despondent : glee
Because someone who is despondent lacks glee or cheer. Like hospitable leads to courtesy, despondent precludes glee.
Claude correctly identifies that hospitable implies showing courtesy, but then picks the opposite relation: someone who is despondent lacks glee.
All of Claude’s answers are here.
I used one analogy in my instruction to Claude, which explains the discrepancy between 35 and 36. ↩
I gave an example in my last post of a good application for LLMs: editing prose. But what specifically makes this problem ideal for solving with a model? Succinctly, it is a problem where solving it is hard, but verifying the solution is easy. I will go into more detail in the rest of this post.
In math, there are types of problems where finding a solution is difficult or impossible, but confirming a solution is easy. A common strategy for these problems is to guess the solution’s form and then verify it, such as for an integral, where a candidate solution can be checked by taking its derivative.
Large language models are particularly useful for exactly these types of tasks: where generating a solution is hard, but verifying it is easy. Editing a paragraph is a prime example of this kind of task since writing multiple versions is time-consuming, whereas verifying the quality of a single paragraph can be done quickly.
Another good use case is writing code, especially if you have tests in place to verify the code’s correctness.
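To make that concrete, here is a hypothetical model-generated helper (`slugify` is my own example, not from the post) next to the cheap-to-write checks that verify it:

```python
# A hypothetical model-generated helper...
def slugify(title: str) -> str:
    """Lowercase, keep alphanumerics, join words with hyphens."""
    words = "".join(c if c.isalnum() else " " for c in title).split()
    return "-".join(w.lower() for w in words)

# ...and the tests that verify it in seconds
assert slugify("Hello, World!") == "hello-world"
assert slugify("  LLMs & You ") == "llms-you"
```

Writing the function is the hard part; running the assertions is the easy verification step, which is exactly the asymmetry that makes this a good LLM task.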
LLMs are bad for problems where verification is hard compared to the generation of an answer.
Some people are using LLMs as a replacement for search engines. This is a perfect example of a bad use of the technology because verifying the accuracy of the information provided by the model takes time and effort. In fact, it often involves additional searches to confirm the validity of the answer, which defeats the purpose of using an LLM in the first place.
But ChatGPT is an incredibly useful tool for my writing due to its ability to rapidly iterate through alternative phrasings for ideas, sentences, and paragraphs. In these cases, the model’s habit of generating false information is not a concern because I wrote the original input so I know what it should say, and I can easily edit out any inaccuracies introduced by the model.2, 3
Let me illustrate how I use ChatGPT to write and copyedit my work.4
I start by writing a rough draft, which may consist of a few bullet points or a detailed set of paragraphs. Next, I give it to ChatGPT for refinement.
Give me three versions of this paragraph:
I start by writing a rough draft. This draft could be a just a few bullet points or completely fleshed out paragraphs. I then feed it into ChatGPT:
- I begin my writing process with a rough draft, which may range from a simple list of bullet points to fully developed paragraphs. Next, I input it into ChatGPT.
- My writing method starts with a rough draft - it could be a brief outline or a detailed set of paragraphs. Afterwards, I pass it to ChatGPT.
- To start, I create a preliminary draft - it may consist of a handful of bullet points or fully written paragraphs. Afterwards, I utilize ChatGPT to refine it further.
I don’t usually take the entire output from ChatGPT, rather I use it to inspire my own editing:5
I start by writing a rough draft, which may consist of a few bullet points or a detailed set of paragraphs. Next, I give it to ChatGPT for refinement.
I iterate using this process, refining my text by sending the sections I’m unhappy with to ChatGPT for improvement until I’m satisfied.6
ChatGPT is the new large language model from OpenAI. It has garnered a lot of attention for its wide range of abilities from conversation to writing songs to information retrieval. It has also drawn criticism for its habit of making up plausible but completely wrong facts, leading many to conclude that while it is entertaining it is not useful.
1. OpenAI's ChatGPT is a highly advanced language model that has received widespread attention for its diverse abilities, including conversation, songwriting, and information retrieval. However, its tendency to generate false information that appears credible has led to criticism and the conclusion that, although entertaining, it may not be practical.
2. ChatGPT, the new AI model from OpenAI, has captured considerable attention for its wide range of skills, such as conversation, songwriting, and information retrieval. Despite this, its propensity for producing believable but false facts has sparked criticism, and many view it as entertaining but not practical.
3. OpenAI's ChatGPT, a new and sophisticated language model, has drawn a lot of attention for its versatility in conversation, songwriting, and information retrieval. Despite this, it has also faced criticism for generating seemingly credible but completely false information, leading many to see it as entertaining but lacking practicality.
I find ChatGPT to be incredibly useful as part of my writing process, because ChatGPT makes it fast and easy to generate different possible ways to write down an idea, sentence, or paragraph. In these cases the fact that the model makes things up is irrelevant because I wrote the original source material so I can filter out any inaccuracies the model introduces.
1. I consider ChatGPT to be a valuable tool in my writing process due to its ability to quickly and effortlessly generate various ways of phrasing an idea, sentence, or paragraph. In these cases, the fact that the model generates false information is not a concern as I am the originator of the source material and can easily identify and remove any inaccuracies introduced by the model.
2. For me, ChatGPT is an extremely useful aspect of my writing as it facilitates the rapid creation of alternative phrasings for ideas, sentences, or paragraphs. The model's tendency to generate false information is inconsequential in these scenarios as I am the source of the original material and can easily filter out any inaccuracies introduced by the model.
3. ChatGPT is an indispensable part of my writing process as it enables me to quickly generate various ways to express an idea, sentence, or paragraph. Its tendency to produce false information is not a problem as I am the writer of the original source material, and I can easily eliminate any inaccuracies introduced by the model.
I have used 'tendency' too much already. Rewrite this five times, each time only replacing the word 'tendency' with another few words to maintain the meaning:
In these cases, the model's tendency to generate false information is not a problem because I wrote the original input so I know what it should say, and I can easily edit out any inaccuracies introduced by the model.
1. In these cases, the model's habit of producing false information does not pose an issue because I created the original input and can easily remove any inaccuracies it may introduce.
2. In these scenarios, the model's inclination to generate false information is of no concern as I am the writer of the original material and can effectively edit out any inaccuracies introduced by the model.
3. The model's propensity for generating false information is not a concern in these instances as I am aware of the intended content of the original input and can efficiently remove any inaccuracies it may introduce.
4. The model's tendency to produce false information is not a hindrance in these situations as I am familiar with the original input and can effortlessly edit out any inaccuracies it generates.
5. The model's habit of creating false information is not problematic because I wrote the original input and can easily eliminate any inaccuracies the model introduces.
Five versions of:
Let me show you how I use it to write and edit my work.
1. Allow me to demonstrate how I utilize it in writing and revising my pieces.
2. Let me illustrate how I employ it in the creation and modification of my writing.
3. I will show you how I use ChatGPT for writing and editing my writing projects.
4. Let me reveal how I integrate it into my writing and editing process.
5. I will demonstrate to you the way I utilize it for writing and revising my work.
Three versions of:
I generally don't take the full paragraph, but I use ChatGPT's versions to get ideas for my own:
1. I don't usually use the entire output from ChatGPT, but I do use it as inspiration for my own writing.
2. Typically, I don't adopt ChatGPT's output in its entirety, instead I draw ideas from it for my writing.
3. My approach is not to fully adopt the ChatGPT output, but rather use it to spark ideas for my own writing.
Give me three versions of this:
I iterate on this process, taking parts of my text I'm not happy with and sending them to ChatGPT, until I like how the work flows.
1. Revising my text with ChatGPT's help, I improve it piece by piece until I'm satisfied with the flow.
2. With the aid of ChatGPT, I repeat the process of refining my text by sending unsatisfactory portions for improvement until I achieve the desired flow.
3. I use ChatGPT to continuously refine my text, sending sections I'm unhappy with for improvement until I'm content with the overall flow.
I come down on the side of AI-artists.
This is probably unsurprising because I am a machine learning engineer, it is my job to build the types of systems these artists are using. But what is less obvious is that my support is also because I am an artist, specifically a landscape photographer.
Photography—just like AI-generated art—has a complicated history as “art”. Although the first photograph was taken in 1826, it wasn’t until 1924 that an American museum recognized the medium as art by including photographs in its permanent collection. At first artists feared photography would replace traditional visual arts due to the ease of taking a picture. But eventually they realized it was a useful tool that could be combined with other art forms, even if they did not recognize photography as an art in its own right.3, 4
The concerns and criticisms currently being directed towards AI-generated art are the same as those leveled against photography in the past. And just as photography eventually gained acceptance as a valid form of art so will AI-generated art. The resistance against it may be strong, but ultimately, it is a losing battle.
My family has a long history of painting. My great-great-great grandfather was the Norwegian landscape painter Hans Gude. My father, also named Hans Gude, was an accomplished oil painter.5 I too wanted to make art, but I did not have their skill with a brush so I picked up a camera instead.
I was drawn to photography specifically because it used technology. I like learning new technologies and how to master them. I also thought it would be easier to make art I was happy with using a camera. I have since learned that photography has its own set of skills to master, but after 15 years I think I was mostly right: it is much easier than oil painting.
I wonder what my great-great-great-grandfather would think of my art. He spent months or years creating his seascapes, while my photographs are captured in a fraction of a second with the push of a button, and maybe a few hours adjusting the tone curve and highlights back at my computer.
But I like to think that he would view my work as a continuation of our family’s artistic tradition. Maybe in the future, my descendants will find the camera too complicated and instead compose prompts for AI to translate into images. To me, that’s simply another evolution of the art form.
Current AI “art” is created on the backs of hundreds of thousands of artists and photographers who made billions of images and spend time, love and dedication to have their work soullessly stolen and used by selfish people for profit without the slightest concept of ethics.
Alexander Nanitchkov (@Artofinca), Twitter, 2022-12-05 ↩
“I believe art is something inherently and intrinsically human, even corporate art made-for-hire is meticulously crafted by experts in their fields,” [Nicholas] Kole said. “When we sit down to draw, design, sculpt or paint, each mark is made with an intention. Each step of the process is an opportunity to ask new questions, tune the piece to the precise context it’s intended for, to add expressiveness and even a point of view. The result—movies, shows, games—are intended to connect that intricate craft with an audience who appreciates and enjoys it.”
AI does none of this, he explained, and he sees “a world filling up with meaningless, regurgitative cardboard cutouts that remind us of real art.”
Xiang, Chloe. Artists Are Revolting Against AI Art on ArtStation. Vice, 2022-12-14 ↩
The fear has sometimes been expressed that photography would in time entirely supersede the art of painting. Some people seem to think that when the process of taking photographs in colors has been perfected and made common enough, the painter will have nothing more to do. We need not fear anything of the kind. Perfection in photography may rid us in time of all the poor work done in color. The work of the artist, however, in which is seen his own individuality, his own perception of the beautiful, his own creation in fact, can no more perish than the soul which inspired it.
Henrietta Clopath. Genuine Art versus Mechanism, in Brush and Pencil Vol. 7, No. 6 (1901-03-01), pp. 331-333 ↩
Photography is an infinitely valuable mechanism by which to obtain records of limited abstract truth, and as such, may be of great service to the artist. Much may be learned about drawing by reference to a good photograph, that even a man of quick natural perception would be slow to learn without such help. But, unless the real shortcomings of the photograph are understood, it must certainly mislead if followed.
But beyond these merely technical matters, art differs from any mechanical process in being “the expression of man’s delight in God’s work”, and thus it appeals to, and awakens all noble sympathy and right feeling. All labor of love must have something beyond mere mechanism at the bottom of it.
Art and Photography, in The New Path Vol. 2, No. 12 (1865-12-01), pp. 198-199 ↩
My father somewhat rejected the title of “artist”, although in later life he branded himself as such. He preferred to think of himself as a craftsman, honing his skills through hard work and study. ↩
As per usual, the Jupyter notebook used to perform this analysis can be found here (rendered on Github).
I selected crashes involving pedestrians from the SQLite database with the following query:
```sql
SELECT collision_date,
       collision_time,
       pedestrian_killed_count
FROM collisions
WHERE collision_date IS NOT NULL
  AND pedestrian_collision = 1                     -- involves a pedestrian
  AND collision_date <= '2020-12-31'               -- 2021 is incomplete
  AND strftime('%m-%d', collision_date) = '10-31'  -- and it happens on Halloween
```
This gave me 1168 data points spanning the years 2001 through 2020, of which 64 involve a pedestrian fatality. Incidents after 2020 are excluded because the database dump is from mid-2021, so that year is incomplete.
Alissa Walker wrote1 that it is not just drivers that make Halloween deadly, it is commuters. The best way to explore this point is to look at when in the day crashes happen:
As we saw in the data for all dates, weekdays have two major peaks in collisions during the morning and evening commutes, as well as a peak during school pickup times. Examining the data for Halloween specifically, we see that when it falls on a weekday the three expected peaks (morning and evening commutes, and school pick-up) are present, but there is also a fourth peak at 18:00, likely due to a combination of darkness making it difficult for drivers to see pedestrians and trick-or-treating bringing more people out walking. This data supports Walker’s observation that commuter traffic contributes significantly to the number of pedestrian incidents.
But Walker makes a very specific claim: that fatalities involving children increase on weekday Halloweens. Does the data support this claim? To find out, we need to look at the fatality rate instead of the total number of fatalities because the number of people driving and walking changes year-by-year and using the rate helps to normalize some of this variation. Below is a plot of the fatality rates for each year’s Halloween, separated into weekday and weekend:
The data above includes all pedestrian fatalities, not just those involving children. At first glance, the distributions for weekday and weekend Halloween fatalities appear similar. A Mann–Whitney U test confirms this, with a p-value of 0.93, indicating that the difference between the two is not statistically significant.
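The notebook presumably uses a standard implementation of the Mann–Whitney U test (for example scipy.stats.mannwhitneyu). As a self-contained sketch of what the test is doing, here is a permutation-based version; the weekday and weekend rates below are made up for illustration and are not the real SWITRS numbers:

```python
import random
from itertools import product

def u_stat(x, y):
    # U = number of (x, y) pairs where x > y, counting ties as 1/2.
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a, b in product(x, y))

def mann_whitney_p(x, y, n_perm=10_000, seed=0):
    """Two-sided p-value by randomly permuting the weekday/weekend labels."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    nx = len(x)
    center = nx * len(y) / 2  # E[U] under the null hypothesis
    observed = abs(u_stat(x, y) - center)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(u_stat(pooled[:nx], pooled[nx:]) - center) >= observed:
            hits += 1
    return hits / n_perm

# Illustrative fatality rates only -- not the real SWITRS numbers.
weekday_rates = [0.00, 0.10, 0.05, 0.00, 0.08, 0.12]
weekend_rates = [0.00, 0.09, 0.06, 0.11]
p_value = mann_whitney_p(weekday_rates, weekend_rates)
```

Because the test only compares ranks, it makes no assumption that the fatality rates are normally distributed, which suits small, skewed samples like yearly Halloween counts.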
But what about children alone (defined as pedestrians under 18)? Here is that data:
One interesting observation is that no children have been killed by cars on weekend Halloweens, whereas about half of the weekdays have seen at least one child death. This suggests that there is something about weekday Halloweens that makes them particularly dangerous for children, consistent with Walker’s claim.
Despite this, the data does not show a significant difference between the two distributions, with a p-value of 0.08. However, this p-value, lower than the one for the all-ages data, does suggest some evidence for the specific claim about child deaths.
But when the commuting drivers are removed from the equation, deaths seem to go down. A study by AutoInsurance.org used FARS data to compare 24 years of crash data by days of the week. Halloweens that fell on workdays had an 83 percent increase in deadly crashes involving kids compared to weekend days. The worst day? Friday. Since 1994, the three deadliest Halloween nights for kids have all been Friday nights.
From: Walker, Alissa (October 25, 2019). The most terrifying part of Halloween for kids is our deadly streets. Curbed Vox Media. ↩
So with all the new data, I wanted to look at the most vulnerable road users: pedestrians.
As per usual, the Jupyter notebook used to perform this analysis can be found here (rendered on Github).
I selected crashes involving pedestrians from the SQLite database with the following query:
SELECT collision_date
, collision_time
, pedestrian_killed_count
FROM collisions
WHERE collision_date IS NOT NULL
AND pedestrian_collision = 1 -- Involves a pedestrian
AND collision_date <= '2020-12-31' -- 2021 is incomplete
This gave me 282,039 data points to examine spanning the years 2001 through 2020. Incidents after 2020 are rejected because the database dump comes from mid-2021, and so that year is incomplete.
For bicycle involved incidents, I found there was an increase from about 2008 through 2013 followed by a decrease. For both bicycles as well as motorcycles, I found strong seasonality with many more crashes during the summer when people are out riding to take advantage of the weather. Pedestrian involved incidents defy both these trends:
Pedestrian involved incidents were flat or slightly down until about 2013, when instead of decreasing like bicycle collisions they increased strongly. Like both bicycle and motorcycle crashes, pedestrian incidents are strongly seasonal, but they decrease in the summer (when there is a lot of light for drivers to see pedestrians) and increase in the winter, when it gets dark early and drivers can’t see them. Of course there is also a massive decrease when COVID restrictions kept most people home starting in March 2020.
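The seasonal pattern falls out of bucketing incident dates by month. A minimal sketch with made-up dates (the notebook presumably resamples the full time series, but the idea is the same):

```python
from collections import Counter
from datetime import date

# Made-up pedestrian-collision dates, weighted toward winter to mirror
# the seasonal pattern described above.
incidents = [
    date(2019, 1, 14), date(2019, 1, 28), date(2019, 2, 3),
    date(2019, 7, 4), date(2019, 11, 20), date(2019, 12, 5),
    date(2019, 12, 19),
]

# Count incidents per calendar month; winter months dominate this sample.
by_month = Counter(d.month for d in incidents)
```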
Cars are involved in crashes on days when drivers have to commute to work and on holidays when people travel. The worst day is Halloween, when people work and then go out and have fun afterward. I was curious whether the large increase on Halloween was due to a large increase in pedestrian collisions; the answer is no:
On Halloween, drivers are more likely to hit pedestrians than on any other day of the year. But it’s only about 15 to 20 more incidents than on any other October day, and there are almost 200 additional car crashes on Halloween. The number of additional pedestrian incidents does not account for much of the increase in car crashes. For a more detailed analysis of pedestrian collisions on Halloween, check out my other post.
Otherwise there are some interesting patterns. Many holidays trend the same direction as cars: New Years, Memorial Day, Veterans Day, Thanksgiving, and Christmas all see a large reduction in both car crashes and pedestrian incidents. Halloween, as covered above, sees a large increase in both.
One outlier is the 4th of July. Car crashes decrease because people do not have to commute, but pedestrian incidents increase. I think this is because people are walking around in the dark going to and coming back from watching fireworks, and drivers have trouble seeing them.
Finally we can look at when cars hit pedestrians by hour and whether it is a weekend or not:
The most striking feature is the large increase in the number of incidents during the morning commute (07–09) and again during the evening commute (17–19)! Commuters in cars are dangerous to pedestrians!
There is also an increase in incidents on both weekends and weekdays at about 17:00. This is probably because that is around sunset.1
The weekend curve rises smoothly through the day, but the weekday curve has a large increase at 14:00. I suspect this is from school pickup, which is generally earlier than the commute.
Finally, it is interesting that the number of late night and early morning incidents is much higher on the weekend. This is likely due to people going out to bars, as the number drops off at 02:00 which is when the bars close in California.
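The hour-by-hour, weekday-versus-weekend split above comes down to a simple groupby on the crash timestamp. Here is a self-contained sketch with made-up timestamps chosen to echo the peaks described (Python's datetime.weekday() returns 0–4 for Monday–Friday, 5–6 for the weekend):

```python
from collections import Counter
from datetime import datetime

# Made-up collision timestamps, invented to echo the peaks above.
crashes = [
    datetime(2019, 10, 28, 8, 15),   # Monday: morning commute
    datetime(2019, 10, 28, 17, 40),  # Monday: evening commute
    datetime(2019, 10, 29, 14, 5),   # Tuesday: school pickup
    datetime(2019, 10, 26, 1, 30),   # Saturday: before the bars close
    datetime(2019, 10, 26, 17, 55),  # Saturday: around sunset
]

# Split into weekday/weekend, then count incidents per hour of day.
weekday_by_hour = Counter(ts.hour for ts in crashes if ts.weekday() < 5)
weekend_by_hour = Counter(ts.hour for ts in crashes if ts.weekday() >= 5)
```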
This dataset covers the whole year so the exact time of sunset changes. It would be interesting to make a similar chart but relative to sunrise and sunset. ↩
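A sunset-relative version would only need each timestamp re-expressed as hours after that day's sunset. A sketch of the transformation, where the sunset table is a hypothetical stand-in (a real version would compute sunset times from latitude and longitude with an astronomy library):

```python
from datetime import date, datetime

# Hypothetical sunset times in decimal hours for a couple of dates.
# A real analysis would compute these for each date in the dataset.
sunset_hours = {
    date(2019, 1, 15): 17.0,   # winter: early sunset
    date(2019, 7, 15): 20.4,   # summer: late sunset
}

def hours_after_sunset(ts):
    """Hours between a crash timestamp and that day's sunset (negative = before)."""
    return ts.hour + ts.minute / 60 - sunset_hours[ts.date()]

# The same clock time lands very differently relative to sunset by season.
winter_delta = hours_after_sunset(datetime(2019, 1, 15, 18, 30))  # after dark
summer_delta = hours_after_sunset(datetime(2019, 7, 15, 18, 30))  # daylight
```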