Algorithm

“It’s all just an algorithm.” We hear it a lot: bandied about in media coverage of, well, the media; used as an explainer for why Facebook knows you like teacups with dragons on them and why Amazon suggests you purchase tissues and why you see those ads in your Gmail about bulbs or deer or survivalist stuff. (All true, btw.) I think there’s a decent-sized chunk of the population that has a context-specific definition for algorithm (e.g., I know that this means a black box in which things are magically done and then Instagram *just knows* that I like fitness videos) but not an *actual* one, which means when I hear that “the algorithm knows” I have no problem with GMOs but do prefer organics and less-processed foodstuffs, I think that it “just knows” without really understanding what that means.

So here’s a primer on algorithms, because this is what goes through my overcaffeinated brain of a Sunday morning. If you’d like to understand more about them, or if you’d like to explain them to someone you think should understand them more, this one’s for you.

Super-Basic Basics

The first thing to know about algorithms is they are not smart. They have no intelligence whatsoever. They’re basically an equation, a formula, a set of rules by which one or more pieces of data (“Bobbie likes pie”, “Bobbie tracks her food on MyFitnessPal”) get “looked at” and then somewhere checked against a list of criteria (“People who like pie like junk food”, “Women who track their food are on a diet”) and then a “logical” conclusion is spat out. You actually can use algorithms in your day to day; you probably already are. Just like the algorithms in your brain, algorithms in computers are built by humans.

Example

For example: up until 2020, I drove about 20,000 miles per year. For those non-drivers in the world or those who are metric-based, that’s more than average. Most dealerships will assume, for their “bring you in for maintenance” purposes, that you’re driving about 12-15,000 miles per year. Because I had a relatively new car up until 2020, and because it was covered for maintenance through some package deal I bought, I was bringing my car in every 5,000 miles. However, the dealer had an algorithm for every 5,000 miles based on what they considered “typical use”. This means that they’d always want to schedule my next maintenance 4 or 5 months from my current one; and I’d frequently have to bump it sooner, because at 20,000 miles/year, I’m driving 5,000 miles every 3 months. I know this and because it was nice simple round numbers, I didn’t have to have a spreadsheet on it. My driving mileage has been pretty consistent for 15 or so years. So the *algorithm* we’re looking at here, to predict when my next appointment is, is Number of Miles Per Year Expected / 5,000 = How Many Times Per Year my car gets serviced. Then it’s How Many Months Per Year / How Many Times Per Year my car gets serviced, to get how many months between each service. If I wanted to be fancy, I could write that as (Months Per Year)/(Miles Per Year Expected/5000). The reason the dealer and I get different numbers is that while we both agree on how many months there are in a year, they are working with a different Miles Per Year Expected. The *algorithm* isn’t wrong, because it isn’t *right*, either. It’s all dependent on what goes in, to determine what comes out.
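If it helps to see that written down, here is a minimal Python sketch of the same formula. The function name and the default values are just labels I made up for this example; nobody is claiming this is what the dealer actually runs.

```python
def months_between_services(miles_per_year, service_interval_miles=5000, months_per_year=12):
    """(Months Per Year) / (Miles Per Year Expected / 5000)."""
    services_per_year = miles_per_year / service_interval_miles
    return months_per_year / services_per_year


# Same algorithm every time; only the Miles Per Year Expected input changes.
for label, miles in [("me", 20_000), ("dealer, low guess", 12_000), ("dealer, high guess", 15_000)]:
    print(f"{label}: service every {months_between_services(miles):.1f} months")
```

Feed it my 20,000 miles and you get a service every 3 months; feed it the dealer's 12-15,000 and you get every 4 or 5. Nothing about the formula changed in between.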

What Happens When Things Change

Now that we are under COVID restrictions, I still drive quite a bit to go visit immediate family every 2 weeks, but aside from that I’m working from home and I’m working out at home and so I don’t drive nearly as much. The *algorithm* still hasn’t changed, but the Miles Per Year Expected has. So now, my number looks a lot more like the dealer’s number: I’m driving about 12k miles/year, and so I would come in every 4 or 5 months. If the *dealer* changes their expectations, though, thinking “oh wow people aren’t driving with COVID we should bump that down to like 5k/year”, then our output of the algorithm will once again differ.

Slightly More Sophisticated Stuff

Simple algorithms are like the one above: it’s got one or more inputs (expected miles per year) and at least one output (Bobbie needs to get her car serviced in June). You can add more inputs, though, and some “checking stations”. These can be what are called “if” statements (If Bobbie likes strawberry pie then assume excess calorie consumption from April to July; if Bobbie likes blueberry pie then assume excess calorie consumption from July to August) which in turn can be built on other “if” statements (If strawberries then In Season = April, May, June; If blueberries then In Season = July, August). You can take these “if” statements, or conditions, and sprinkle them in all of the parts of the algorithm: at the beginning, middle, and even at the end, to determine the ending.
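Here is roughly what those stacked “if” statements could look like in Python. The function names are invented for this sketch, and to keep it short I let the calorie assumption simply follow the In Season months rather than tracking a separate window.

```python
def in_season_months(fruit):
    """Inner "if" statements: when is each fruit in season?"""
    if fruit == "strawberry":
        return ["April", "May", "June"]
    if fruit == "blueberry":
        return ["July", "August"]
    return []  # no rule for this fruit, so no conclusion


def excess_calorie_months(likes_pie, favorite_fruit):
    """Outer "if" statement, built on top of the inner ones."""
    if likes_pie:
        return in_season_months(favorite_fruit)
    return []


print(excess_calorie_months(True, "strawberry"))
print(excess_calorie_months(True, "blueberry"))
print(excess_calorie_months(False, "blueberry"))
```

Notice that the outer check never needs to know anything about berries. It just asks the inner check and passes the answer along.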

Again, you probably do this all the time. Say you’re at Costco. I don’t know about you but I like to limit my Costco trips because crowds are not my thing; also because I like to limit my trips in general (I’m the sort of person who has a categorized grocery list). Most folks have a grocery list, and most folks have a Costco list. You’re at Costco, and they have special pallet stacks of stuff on sale (the pricing usually indicates how much off). And you’re in front of the toilet paper, which was not originally on your list. This is a more sophisticated algorithm you’re running in your head:

Inputs:

  1. Toilet Paper is On Sale
  2. Toilet Paper is 36 rolls
  3. Sale is only good for about 1 week
  4. I am not coming back to Costco for at least 3 weeks.
  5. How much toilet paper do you have at home

Evaluation: Here you need your algo to check a few things:

  1. Do you have the money in your planned budget for the extra toilet paper that was not on your list? – this is an evaluation that you can do with only one of the inputs – the Sale Price
  2. Do you need toilet paper between now and the time you *think* it will next be on sale? – this evaluation is done with the input of the volume of toilet paper you have at home, plus the amount of time between now and when you think it could be next on sale. (You know the next time you’re coming to Costco, in at least 3 weeks. But it may not be on sale then.)
  3. Do you have the storage capacity for the extra 36 rolls? – this evaluation is done independently of 1 and 2 — straight up can you stock 36 rolls or not?

As you evaluate each of these, you spit out the “result” of your algorithm, perhaps as these steps (remember, these assume you didn’t need toilet paper right now, and that this was just something to evaluate on top of your regular list):

  1. If I have money for this, then go to step 2. Otherwise, keep rolling my cart.
  2. If I think toilet paper will be on sale the next time I am here,
    1. *AND* I can last that long until I need toilet paper, then keep rolling my cart.
    2. *AND* I cannot last that long until I need toilet paper, then go to step 5.
    3. Otherwise, if I think it will not be on sale next time, go to step 3.
  3. If it is worth it to me to delay purchasing the toilet paper for next time at the expense of the sale price (e.g., is 3 weeks wait better than $4 off?), then keep rolling my cart, else go to step 4
  4. If I can store the toilet paper, go to step 5. Else, keep rolling my cart.
  5. Buy toilet paper.
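The five steps above can be sketched as one small Python function. Every name here is made up for the example, and each argument is just a yes/no answer you would supply yourself while standing at the endcap.

```python
def should_buy(has_budget, on_sale_next_time, can_last_until_next_trip,
               worth_waiting, can_store):
    """Walk the five steps; True means "buy", False means "keep rolling my cart"."""
    # Step 1: no money for it, no purchase.
    if not has_budget:
        return False
    # Step 2: if it will be on sale next time, buy only if I can't last until then.
    if on_sale_next_time:
        return not can_last_until_next_trip
    # Step 3: it won't be on sale next time; is waiting worth more than the discount?
    if worth_waiting:
        return False
    # Step 4 leads to step 5: buy only if there is somewhere to put 36 rolls.
    return can_store


print(should_buy(has_budget=True, on_sale_next_time=False,
                 can_last_until_next_trip=True, worth_waiting=False, can_store=True))
```

Swap “toilet paper” for steaks or kale and the function does not change at all; only the answers you feed it do.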

Here’s the thing: this evaluation happens in the space of a minute or two in your brain, standing at the endcap of toilet paper in Costco while trying to avoid getting sideswiped by carts and small children running to get the free food. You probably spent more time reading through that list than you would actually doing the evaluation in your head, at Costco. You’ve just run an algorithm, because you could easily have replaced “toilet paper” in this decision, with say, “steaks” or “beer” or “high-end whey protein shake mix” or “kale” or “salmon” or “bread” or any of a number of consumable goods. You could replace the windows of your visits to Costco with different figures (I know folks who go every week, every two weeks, only when needed, etc.). You could replace the amount of the sale price in the evaluation (e.g., $4 trade off for your visit window may be enough. But is $2? Or would $10 be a good trade off of convenience for a 2-month window? etc.). The *steps* are the same, the kinds of things that you are checking in the steps are the same, but the specifics differ from situation to situation.

Algorithms In the World

When we say “Facebook runs an algorithm and so they know you like Argyle Socks”, we mean that Facebook has a HUGE volume of inputs (ones you give it and ones it infers and ones it purchases) and a HUGE volume of conditions it evaluates.

It can for example extrapolate from the data you give it (say, photos, comments on friends’ posts, clicks you do *on Facebook*, etc.) that you like socks.

It can infer from things your friends post, or from cookies it drops (think: little text tracker that sits in the background of your computer that, when you leave Facebook.com, gets “looked for” by other websites that Facebook has deals with. That rando website checks to say “hey computer you got a Facebook cookie?” and your computer says “yup I got a Facebook cookie, it’s cookie number bla-bla” and that website says “cool beans thanks I’ll make a note of it”. Because Facebook *made* the cookie, it knows that bla-bla belongs to you. And because there are millions of sites that Facebook agrees to check for cookies with, sites that Facebook does not own or operate, Facebook can know that you went on Target, for example, and shopped for argyle socks.).

Facebook also straight up purchases data. “Hey argyle sock company, let me know the typical demographic by zip code of people who buy your socks!” When the argyle sock company comes back and says “ok so like in 98074 the typical argyle sock purchaser is female (we infer this because they bought women’s argyle socks) and over 30 (we infer this because she didn’t use PayPal or Apple Pay she used like an old school credit card)”, Facebook can marry that up with marketing data that says the average 98074 female over 30 is also married with an income bracket of XYZ and likely owns and doesn’t rent.

Facebook can then take all of *that* data and run it through *another* set of checking stations and say ok so if she likes argyle socks then with this other data we have about her *what else* can we market to her? Maybe there’s a high correlation of female argyle sock wearing disposable income homeowner to coffee consumption. Let’s try that. Oh, did she click it? Our checking stations were *right*, let’s use them more. Oh, did she not? More data for the checking stations.
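A toy Python version of one such “checking station” might look like the sketch below. This is purely an illustration of the try-it-and-see loop: the class, its fields, and the trait labels are all invented here, and it is not a claim about how Facebook actually builds anything.

```python
class CheckingStation:
    """A made-up rule: people with these traits might also want this product."""

    def __init__(self, required_traits, product):
        self.required_traits = set(required_traits)
        self.product = product
        self.shown = 0
        self.clicked = 0

    def matches(self, profile_traits):
        # The rule fires only if the profile has every required trait.
        return self.required_traits <= set(profile_traits)

    def record(self, did_click):
        # Click or no click, it is more data for the checking station.
        self.shown += 1
        if did_click:
            self.clicked += 1

    def click_rate(self):
        return self.clicked / self.shown if self.shown else 0.0


station = CheckingStation({"argyle socks", "homeowner"}, "coffee")
profile = {"argyle socks", "homeowner", "over 30"}
if station.matches(profile):
    station.record(did_click=True)   # she clicked the coffee ad
print(station.product, station.click_rate())
```

A rule with a high click rate gets used more; one with a low click rate gets tweaked or dropped. That bookkeeping is the whole “learning” part.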

This is just one (very tortured) example: nearly every site you interact with (not just Facebook or its properties), every company that you purchase goods or services from (e.g., banks, insurance companies, etc.), and most especially every company you work with that gives you something “for free” (e.g., Instagram, Snapchat, Pinterest, etc.) collects this information, and has their own special list of algorithms they chug through and spit out ideas as to what you like or don’t like, what you do or do not want. Sometimes they sell these ideas, sometimes they purchase others’ ideas and marry them up with *their* ideas to get super-specific ideas about you. The more inputs they can get, the more outputs they can test, and the more testing they do, the more accurate they can get. This isn’t just about argyle socks either: they can suggest or infer political preference, disposable income, sexual preference, charitable leanings, religious leanings, and so forth. They can then market to you based on what they think you want to hear, or want to read, or want to buy.

All just an algorithm.

Freedom

It’s that time of year again, where kids are out of school and we all forget about the responsibilities and management associated with education. School’s out for the summer!

Here in Washington State our legislators have come up with a budget (after two special sessions, for which, may I remind you dear voter, our congresspersons get paid). It got signed in, but doesn’t include the funding for the recent education bill that got passed, which totals slightly over $2 billion. Out of $38 billion, that means we’re missing about 5% or so of our budget. As much as I want to look at that and still give us an “A”, I’m a pretty harsh grader.

This little rounding error is for reduced class sizes, voted in by the constituency. The reason why there’s no funding for it is the measure didn’t include a funding source, which is like saying “Do you want to have free groceries?” as a voting item. Of course you want free groceries, or reduced class sizes. When we don’t address how it’s going to get paid for, however, we end up with extended sessions and bickering and our very own elected officials trying to delay a measure we elected to have. A funding measure wasn’t included, though, because as soon as you mention the possibility of raising taxes of any sort, be it real estate, business, sales, or (eek!) instituting an income tax, people lose their collective shit.

Here’s the thing: we can get mobilized around *some* social progress. We have gay marriage and subsidized healthcare and it only took Donald Trump one speech to ignite and unify the Latino vote (hi, I’m one of ’em, Donald) and get NBC, Macy’s, etc. to drop him like a hot potato. We are a country moving towards better social freedoms, recognition of our needs as a society, and intolerance of intolerance.

“We” (and by “we” I mean our dear, elected officials) do this because of one very simple reason: those movements represent votes. They get the Latino vote. Or the gay vote. Or the elderly vote. Or the African-American vote. Or the women’s vote. They love those voters! Those voters will help them *win*. It will be great.

As long as those voters aren’t educated.

We live in a country that is 14th in the world for education — and a state that is 20th in the US. Those figures are dropping with each year.  You don’t have to be smart to vote, and when you have your Legislative Branch playing games with numbers to “pass a budget” that doesn’t include all of the things that it is required to pay for, it’s better if the voters aren’t smart.

I live in a good school district. Our kids get issued laptops.  One of the more common rejoinders to this is: if the school district can furnish laptops, why can’t it pay its teachers (or reduce class sizes)? Great question.

Local school districts augment federal and state money (because it’s not enough) by levies and bonds. Here in our county it’s not uncommon to see an education bond measure every two years — for this district or the one down the road — to cover a given thing. Technology levies are separate from operating levies are separate from capital bonds (the latter used for building new schools). So if the tech levy passes but the operating levy doesn’t, you get computers but no one to administrate them.

Let’s take a look, then, at the operational cost of a teacher — that’s really what it comes down to, right? The teacher is who your child interacts with on a daily basis, they’re the ones that “take all summer off” and “Only work like 6 hours a day and get multiple in-service days and spring break and such”. Let’s look at a “Schedule C” teacher, who has either a BA and 90 credits or a Master’s Degree. We will take one who is 5 years in. That teacher makes $43,607/year. (Note to those who go look up those hourly rates — those are based on in-class hours. They are not based on hours worked).

Let’s further say the teacher doesn’t work at all during the 10 weeks of summer (they actually go in a week early, but it makes the math easy), or spring break (1 week), winter break (2 weeks), and holidays (Veterans Day, the day after Thanksgiving, Presidents Day, and mid-winter break add up to a week). I exclude Thanksgiving and Memorial Day because they are typically off for everyone.

OK so 52 weeks/year, minus 10 for summer, 3 for regular breaks, and another for miscellaneous days: 52 - 14 = 38 weeks. That translates to $1147/week, before taxes, or an hourly rate of $28.67 at 40 hours a week. Woo hoo! Riches behold!
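For anyone who wants to check the arithmetic, here it is as a few lines of Python. The variable names are mine; the numbers are the ones above. Working from the unrounded weekly figure, the results land within a few cents of the ones quoted.

```python
annual_salary = 43_607            # Schedule C teacher, 5 years in
weeks_off = 10 + 3 + 1            # summer, regular breaks, miscellaneous days
weeks_worked = 52 - weeks_off     # 38

weekly_pay = annual_salary / weeks_worked
hourly_at_40 = weekly_pay / 40    # assumes a flat 40-hour week

print(f"{weeks_worked} weeks worked")
print(f"${weekly_pay:.2f} per week before taxes")
print(f"${hourly_at_40:.2f} per hour at 40 hours/week")
```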

Well, wait. Do they really work 40 hours?

My son’s school starts at 7:40am and gets out at 2:10pm. Teachers are expected on-campus by 7:10am. So let’s assume they hightail it out of there with the kids and do not stay late to cover detentions (they do), test retakes (ditto), clubs (which they do and it’s usually on their own time, but it’s a choice so we will ignore that). That’s 7 hours. Oh, they get lunch, for 40 minutes. That means 1 hour, 40 minutes short of an 8 hour workday.

Except there is no room in there for lesson planning, grading, etc. Six classes at 30 kids/class is 180 kids worth of papers to grade, tests to grade, and lesson plans. Fine. Let’s be super-generous and say that is used up with that 1 hour and 40 minutes. (Note: my kid averaged 3 hours of homework per night in 6th grade. Each class had one graded item per night, roughly, not including major projects and papers. Translation: go through roughly 180 pieces of math homework and check the answers and that they showed their work correctly. At one minute per paper you have used up all of your 100 minutes and then some).

Great! We’re done.

No, we’re not. These days, your dear teachers are expected to answer email from students and parents. This averages 30-50 per day (I am not exaggerating, I asked a bunch of different teachers, and I know I contributed to that count more than a few times). Call it 30 per day at 1 minute to read and 1 minute to respond: that’s another hour a day. Then add in IEP meetings (teachers with a student in their class in an IEP attend one or two of these a year, and there’s about 2 per class, so 12 per teacher) and those add up to another 15 minutes a week. Then add in staff meetings, call it another 15 minutes per week.

With me? Your 40-hour per week teacher is now at roughly 48 hours/week. Let’s go back and do that math again: $24/hour. Looks great! Except remember we removed all those weeks off the teacher gets — we assumed s/he didn’t get paid for that period.
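Here is a rough Python tally of that week, as a sketch. The function and its parameter names are made up for the example, and the 40-hour base assumes grading and planning exactly fill the 1 hour 40 minutes from earlier. The itemized extras come to 45.5 hours; the rate is also shown at the roughly 48 hours used above.

```python
annual_salary = 43_607
weeks_worked = 38
weekly_pay = annual_salary / weeks_worked


def weekly_hours(emails_per_day=30, minutes_per_email=2,
                 iep_minutes_per_week=15, staff_minutes_per_week=15,
                 base_hours=40, days_per_week=5):
    """Base week plus email and meeting time, all converted to hours."""
    email_hours = emails_per_day * minutes_per_email * days_per_week / 60
    meeting_hours = (iep_minutes_per_week + staff_minutes_per_week) / 60
    return base_hours + email_hours + meeting_hours


tallied_hours = weekly_hours()   # 45.5 with the defaults above
for hours in (40, tallied_hours, 48):
    print(f"{hours} hours/week -> ${weekly_pay / hours:.2f}/hour")
```

Bump the email count to 50 a day, or add back the detentions and retakes that were waved away, and the hourly number only goes down from there.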

Now let’s look at how much “life” costs.

  • Take off 20% for taxes.
  • The cheapest 2 bedroom apartment I could find within a 20 minute drive (because there is a gas/transportation trade off here) is $1200 ($14,400/year).
  • $300/mo for food
  • $100/mo for transportation — bus and/or gas money/insurance
  • $150/mo electric/gas
  • 10% for retirement

That’s $2294 - (20% × 2294) - 1200 - 300 - 100 - 150 - (10% × 2294) = 2294 - 458 - 1200 - 300 - 100 - 150 - 229 = -143, and guess what, we’re in negative numbers. Because after I take out electricity/gas we have only $86, and that’s all the teacher can put to retirement.
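The same budget as a short Python sketch, for anyone who would rather let a computer do the subtracting. The variable names are invented; the $2294 starting figure and the line items are taken straight from the list above. The results match the hand math give or take a dollar of rounding.

```python
monthly_income = 2294                      # starting figure used above
taxes = 0.20 * monthly_income              # take off 20% for taxes
retirement_goal = 0.10 * monthly_income    # 10% for retirement

fixed_costs = {
    "rent": 1200,
    "food": 300,
    "transportation": 100,
    "electric/gas": 150,
}

left_before_retirement = monthly_income - taxes - sum(fixed_costs.values())
left_after_retirement = left_before_retirement - retirement_goal

print(f"Left after taxes and bills: ${left_before_retirement:.2f}")
print(f"Left after the 10% retirement goal: ${left_after_retirement:.2f}")
```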

As long as they don’t have kids. Or pets. Or hobbies. Or unforeseen medical expenses. Or mandatory union dues. Or chipping in for the kid who can’t afford school supplies. Or student loans, because our higher education system is horrifically messed up, too.

Today we celebrate our independence from a government that wanted to give us taxation without representation. We need to look at our government today and understand our responsibilities, and theirs. We pay the taxes. We may need to pay more. In turn, we need our legislators to represent: not just because they “let” us have the freedoms we were already granted (my 12 year old was shocked to find out gay people couldn’t get married already) in our constitution, but because we put the legislators where they are today.

If they don’t represent what we need, then we need to put others in there who do. That is the ultimate freedom we have as Americans, and we need to remember it, and use it.

Great Expectations

‘Tis the season here at My Big Software Company, where we rate ourselves and rate our peers and rate our managers in a method that doesn’t actually impart A Number, you see, but is still used to determine those numbers which are most important to working folks: how much you get paid, either in one shot (bonus), in the future (stock), or over time (raise). In other words, it is review season, and it sucks.

I don’t care how careful HR is and how well prepared they are. I don’t care what the template and tools are you are given to follow. The fact of the matter is that at least once a year and, ostensibly, four to six times a year, you are sat down and are told to quantify, in a variety of ways, the working worth of the people around you. And they are told to do the same about you. It sucks.

It’s horrifying and necessary. This process is meant to weed out the freeloaders, the bad seeds, Those Who Do Not Fit, for want of a better word. As a manager I dreaded reviews (because as much as everyone says they want to lead a team of rock stars, guess what happens when you actually do? Now you have to rate rock stars. Which means only a few rock stars can be the rock stars of rock stars. Talk about splitting hairs.) As an employee I dread reviews (because as hot shit as I can think I am, and sometimes I really am, like any teenager staring in the mirror there are a load more times where I wasn’t even a lukewarm fart).

That more companies are moving to a system where this is not technically quantified in numbers (e.g., as a manager I would not say “Jane” is a “3” on a scale of 1-5, for, you see, historically Janes and Jons were appalled at being reduced to a number) means that this gets harder, not easier. How do I tell you that you are doing “pretty good” but not “really good” and so you only get a mediocre raise? How do I tell you I had to compare you to the guy who came in under-leveled (in some cases by 3 levels) because of someone ELSE’s hiring error, that has nothing to do with you? I don’t. I just tell you where you fell on the curve.

One of my favorite memes is the one that is attributed to Kurt Vonnegut but wasn’t his (and was later imparted by Baz Luhrmann on “Wear Sunscreen”): it tells you that the race is long, and in the end, it’s only against yourself. If there were some way to measure one’s improvement against oneself, and then weight that within reasonableness (because frankly, I can have a deliberately shit year and then bust my ass for an easy “improvement” rank), that would be better.

Interesting point of fact though: we hold our kids to numbers.

My kid is in 6th grade, almost 7th (2! weeks! to go!), and is held to the standard 1 (D), 2 (C), 3 (B), 4 (A) scale that I grew up with. Every assignment is reduced to numbers and faithfully reported and published (to the point where I often know his score before he does). This number, along with numbers in standardized testing, either within the school or external to the school (Washington state is on its 4th or 5th “standardized” test in the last 10 years, none of which equate to one another, so it’s a constantly shifting field), will determine what classes he can take, which math path he is on, if he can participate in extracurricular activities, etc. And he’s 12. Whereas his mother is 30 years his senior and doesn’t have the “advantage” of a number.

As a society we constantly worry about preparing our kids for the future, to be competitive within the global sphere. They are learning things 2 years earlier than we did at their age, both in math formulae and science concepts. They are expected to perform and they are connected in a way we never were: the kids are handed laptops as a required tool for school. The internet was this totally shady side thing when I was in school and generally not talked about. Now it’s a project to tell him about how plagiarism works and that Wikipedia is informative but cannot be your data source. We grade them and numericize them and then let them take and retake tests as needed to make sure the number fits. In short we are preparing our kids very, very well in one way, and very, very poorly in another.

In the working world, you are held to a numeric standard but it is never actually communicated to you. In the working world there are damned few test retakes and there is little extra credit. It’s this world full of meetings and 1-on-1’s and phraseology without hard-core definition. In the student world it’s the opposite: little individual time and little talk, all strict grading and numeric application. In college this gets less personal and more regimented. We train our kids to know things, but not apply them.

This mad scramble that results, inevitably, in a new testing method every two years or so means that we are trying to hit a moving target with a bow and arrow while on the back of a truck in the middle of an earthquake. Instead of sticking with one test, however suboptimal, we change the test in hopes of finding some “perfect test” that will make everything sane. Instead of gearing curricula towards the Real World, we chase some phantom metric that is meant to make us feel better about being twenty-somethingth (or is it thirty-somethingth, now?) in the world on education. When we were, at one time, first.

We are two weeks out from the final grades that will numerically identify how “well” my kid did in school this year. We are two months from the longer, more complicated, not-numerically-driven conversation with my boss about how “well” I did at work this year.

In neither case can we state with confidence that the analysis was foolproof, regardless of the outcome.

Solving for X

It’s been some 20 years since I last messed with PreCalculus and I was apprehensive as the quarter started. I mean, do you remember how to factor a quadratic equation?

Most of the last six days I’ve spent poring over my online textbook, doing the requisite problems and watching the requisite videos, trying to get back into the hang of things, mathematically. Part of the problem is that the first time I took this in school it was to satisfy a separate need: as a Marine Biologist, how often was I really going to need to use trigonometry? Or create mathematical formulae to describe something? You never saw Jacques Cousteau whip out a Texas Instruments graphing calculator, so I spent four or five quarters of advanced math thinking, “yeah, yeah, but this doesn’t really apply to me”. I studied long enough to get the grade and not one moment longer.

Here we are 20 years later, I’m in the same class (in the same school – although not with the same teacher) doing the same work, and have discovered two things:

  1. It’s a lot easier to do the work if you understand the theory and are studying to that rather than the formula itself – if you get the “concept” you can back into the “formula”; it doesn’t work so well the other way around, and
  2. The newer textbooks have pretty much accepted you’re going to rote-memorize some things and probably don’t care about the formula.

Yep, you read that right. For example, one of the things I find now in my text are handy “tables” that tell you the “standard answers” for common mathematical functions. Twenty years ago, we had to demonstrate mathematically WHY, for example, sin(π/6) (aka sin 30°) = 1/2. You got out your quadrille paper, you graphed a unit circle, you labeled stuff, drew your arc, and did the math. Now, you have a table. This helps, right?
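For the curious, the unit-circle exercise can be replayed in a few lines of Python. This is only a numerical sanity check, not a proof, and it uses nothing beyond the standard `math` module.

```python
import math

angle = math.pi / 6                        # 30 degrees, expressed in radians
x, y = math.cos(angle), math.sin(angle)    # the point on the unit circle at that angle

# The point sits on the unit circle: x^2 + y^2 = 1.
assert math.isclose(x * x + y * y, 1.0)

# Its height above the x-axis is sin(pi/6), which is one half.
assert math.isclose(y, 0.5)

# Starting from degrees works too, as long as you convert first:
# math.sin expects radians, so math.sin(30) on its own is NOT 0.5.
assert math.isclose(math.sin(math.radians(30)), 0.5)
assert not math.isclose(math.sin(30), 0.5)

print("sin(pi/6) is about", round(y, 6))
```

The last two checks are the “concept versus formula” point in miniature: knowing that the table says 1/2 is no help if you don’t know which units the function wants.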

Not really. Sure, you have a handy table, and you go and apply that to all of the problems in the homework. Or you leverage your graphing calculator to tell you that sin(30)=0.5, no problem. But when it comes time to use what you have learned so far to apply it to a new concept, or to solve a problem where there is more than one missing value, you’re hosed until you get another table or some set of instructions on what to plug into your TI-83.

As I’m actually going to USE this math in Economics – first quarter Microeconomics shows you enough graphs and charts that you immediately understand the significance of Understanding What The Graph Is Actually Telling You and How To Derive a Formula For It – I wish the textbooks actually worked to have you get the theory as much as they do the application. This is like when you’re at work and your boss asks you to provide a presentation and then hands you the template and tells you exactly what to write – that’s great, but I’d really like to participate, please.