Algorithm

“It’s all just an algorithm.” We hear it a lot: bandied about in media coverage of, well, the media; used as an explainer for why Facebook knows you like teacups with dragons on them and why Amazon suggests you purchase tissues and why you see those ads in your Gmail about bulbs or deer or survivalist stuff. (All true, btw.) I think there’s a decent-sized slice of the population that has a context-specific definition for algorithm (e.g., I know that this means a black box in which things are magically done and then Instagram *just knows* that I like fitness videos) but not an *actual* one, which means when I hear that “the algorithm knows” I have no problem with GMOs but do prefer organics and less-processed foodstuffs, I think that it “just knows” without really understanding what that means.

So here’s a primer on algorithms, because this is what goes through my overcaffeinated brain of a Sunday morning. If you’d like to understand more about them, or if you’d like to explain them to someone you think should understand them more, this one’s for you.

Super-Basic Basics

The first thing to know about algorithms is they are not smart. They have no intelligence whatsoever. They’re basically an equation, a formula, a set of rules by which one or more pieces of data (“Bobbie likes pie”, “Bobbie tracks her food on MyFitnessPal”) get “looked at” and then checked somewhere against a list of criteria (“People who like pie like junk food”, “Women who track their food are on a diet”) and then a “logical” conclusion is spat out. You actually can use algorithms in your day to day; you probably already are. Just like the algorithms in your brain, algorithms in computers are built by humans.

Example

For example: up until 2020, I drove about 20,000 miles per year. For those non-drivers in the world or those who are metric-based, that’s more than average. Most dealerships will assume, for their “bring you in for maintenance” purposes, that you’re driving about 12,000-15,000 miles per year. Because I had a relatively new car up until 2020, and because it was covered for maintenance through some package deal I bought, I was bringing my car in every 5,000 miles. However, the dealer had an algorithm for every 5,000 miles based on what they considered “typical use”. This means that they’d always want to schedule my next maintenance 4 or 5 months from my current one; and I’d frequently have to bump it sooner, because at 20,000 miles/year, I’m driving 5,000 miles every 3 months. I know this and because it was nice simple round numbers, I didn’t have to have a spreadsheet on it. My driving mileage has been pretty consistent for 15 or so years. So the *algorithm* we’re looking at here, to predict when my next appointment is, is Number of Miles Per Year Expected / 5,000 = How Many Times Per Year my car gets serviced. Then it’s How Many Months Per Year / How Many Times Per Year my car gets serviced, to get how many months between each service. If I wanted to be fancy, I could write that as (Months Per Year)/(Miles Per Year Expected/5000). The reason the dealer and I get different numbers is that while we both agree on how many months there are in a year, they are working with a different Miles Per Year Expected. The *algorithm* isn’t wrong, because it isn’t *right*, either. It’s all dependent on what goes in, to determine what comes out.
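If you like seeing this sort of thing written down, here is a minimal Python sketch of that service-interval formula. The function and parameter names are mine (the dealer certainly doesn't call it this), and the 5,000-mile interval and 12 months are just the figures from the example above.

```python
def months_between_services(miles_per_year_expected: float,
                            service_interval_miles: float = 5000,
                            months_per_year: int = 12) -> float:
    """(Months Per Year) / (Miles Per Year Expected / 5000), as described above."""
    # How many times per year the car gets serviced.
    services_per_year = miles_per_year_expected / service_interval_miles
    # How many months pass between each service.
    return months_per_year / services_per_year


# Same algorithm, different input, different output.
my_interval = months_between_services(20_000)             # my pre-2020 mileage
dealer_low = months_between_services(12_000)              # dealer's "typical use", low end
dealer_high = months_between_services(15_000)             # dealer's "typical use", high end
print(my_interval, dealer_low, dealer_high)
```

The function never changes between me and the dealer; only the Miles Per Year Expected that gets fed into it does.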

What Happens When Things Change

Now that we are in COVID restriction, I still drive quite a bit to go visit immediate family every 2 weeks, but aside from that I’m working from home and I’m working out at home and so I don’t drive nearly as much. The *algorithm* still hasn’t changed — but the Miles Per Year Expected has. So now, my number looks a lot more like the dealer’s number — I’m driving about 12k miles/year, and so I would come in every 4 or 5 months. If the *dealer* changes their expectations, though, thinking “oh wow people aren’t driving with COVID we should bump that down to like 5k/year”, then our output of the algorithm will once again differ.

Slightly More Sophisticated Stuff

Simple algorithms are like the one above: it’s got one or more inputs (expected miles per year) and at least one output (Bobbie needs to get her car serviced in June). You can add more inputs, though, and some “checking stations”. These can be what are called “if” statements (If Bobbie likes strawberry pie then assume excess calorie consumption from April to July; if Bobbie likes blueberry pie then assume excess calorie consumption from July to August) which in turn can depend on other “if” statements (If strawberries then In Season = April, May, June; If blueberries then In Season = July, August). You can take these “if” statements, or conditions, and sprinkle them in all of the parts of the algorithm: at the beginning, middle, and even with the ending to determine the ending.

Again, you probably do this all the time. Say you’re at Costco. I don’t know about you but I like to limit my Costco trips because crowds are not my thing; also because I like to limit my trips in general (I’m the sort of person who has a categorized grocery list). Most folks have a grocery list, and most folks have a Costco list. You’re at Costco, and they have special pallet stacks of stuff on sale (the pricing usually indicates how much off). And you’re in front of the toilet paper, which was not originally on your list. This is a more sophisticated algorithm you’re running in your head:

Inputs:

  1. Toilet Paper is On Sale
  2. Toilet Paper is 36 rolls
  3. Sale is only good for about 1 week
  4. I am not coming back to Costco for at least 3 weeks.
  5. How much toilet paper do you have at home

Evaluation: Here you need your algo to check a few things:

  1. Do you have the money in your planned budget for the extra toilet paper that was not on your list? – this is an evaluation that you can do with only one of the inputs – the Sale Price
  2. Do you need toilet paper between now and the time you *think* it will next be on sale? – this evaluation is done with the input of the volume of toilet paper you have at home, plus the amount of time between now and when you think it could be next on sale. (You know the next time you’re coming to Costco, in at least 3 weeks. But it may not be on sale then.)
  3. Do you have the storage capacity for the extra 36 rolls? – this evaluation is done independently of 1 and 2 — straight up can you stock 36 rolls or not?

As you evaluate each of these, you spit out the “result” of your algorithm, perhaps as these steps (remember, these assume you didn’t need toilet paper right now, and that this was just something to evaluate on top of your regular list):

  1. If I have money for this, then go to step 2. Otherwise, keep rolling my cart.
  2. If I think toilet paper will be on sale the next time I am here,
    1. *AND* I can last that long until I need toilet paper, then keep rolling my cart, else
    2. *AND* I cannot last that long until I need toilet paper, go to step 5
    3. If I think it will not be on sale next time, then go to step 3
  3. If it is worth it to me to delay purchasing the toilet paper for next time at the expense of the sale price (e.g., is 3 weeks wait better than $4 off?), then keep rolling my cart, else go to step 4
  4. If I can store the toilet paper, go to step 5. Else, keep rolling my cart.
  5. Buy toilet paper.
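The steps above can be sketched in Python, if that helps. This is only an illustration under my own naming: the `Situation` fields are hypothetical stand-ins for the answers you come up with in your head, and the function follows the numbered steps as written (including step 2 jumping straight to "buy" when you can't last until the next sale).

```python
from dataclasses import dataclass


@dataclass
class Situation:
    # Each field is a yes/no answer to one of the questions in the steps above.
    can_afford: bool                 # step 1: money in the planned budget?
    on_sale_next_visit: bool         # step 2: do I think it will be on sale next time?
    can_last_until_next_visit: bool  # step 2: enough at home until then?
    waiting_beats_discount: bool     # step 3: is delaying worth more than the sale price?
    can_store: bool                  # step 4: room for 36 more rolls?


def should_buy(s: Situation) -> bool:
    """Return True for 'buy', False for 'keep rolling my cart'."""
    # Step 1: no money, no purchase.
    if not s.can_afford:
        return False
    # Step 2: if it will be on sale next time, buy only if I can't last that long.
    if s.on_sale_next_visit:
        return not s.can_last_until_next_visit
    # Step 3: not on sale next time; is waiting still the better deal?
    if s.waiting_beats_discount:
        return False
    # Step 4: buy (step 5) only if there is somewhere to put it.
    return s.can_store


decision = should_buy(Situation(
    can_afford=True,
    on_sale_next_visit=False,
    can_last_until_next_visit=True,
    waiting_beats_discount=False,
    can_store=True,
))
print("Buy" if decision else "Keep rolling")
```

Nothing in the function mentions toilet paper, which is rather the point: swap in steaks or kale and the same checks still apply.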

Here’s the thing: this evaluation happens in the space of a minute or two in your brain, standing at the endcap of toilet paper in Costco while trying to avoid getting sideswiped by carts and small children running to get the free food. You probably spent more time reading through that list than you would actually doing the evaluation in your head, at Costco. You’ve just run an algorithm, because you could easily have replaced “toilet paper” in this decision, with say, “steaks” or “beer” or “high-end whey protein shake mix” or “kale” or “salmon” or “bread” or any of a number of consumable goods. You could replace the windows of your visits to Costco with different figures (I know folks who go every week, every two weeks, only when needed, etc.). You could replace the amount of the sale price in the evaluation (e.g., $4 trade off for your visit window may be enough. But is $2? Or would $10 be a good trade off of convenience for a 2-month window? etc.). The *steps* are the same, the kinds of things that you are checking in the steps are the same, but the specifics differ from situation to situation.

Algorithms In the World

When we say “Facebook runs an algorithm and so they know you like Argyle Socks”, we mean that Facebook has a HUGE volume of inputs (ones you give it and ones it infers and ones it purchases) and a HUGE volume of conditions it evaluates.

It can for example extrapolate from the data you give it (say, photos, comments on friends’ posts, clicks you do *on Facebook*, etc.) that you like socks.

It can infer from things your friends post, or from cookies it drops (think: little text tracker that sits in the background of your computer that, when you leave Facebook.com, gets “looked for” by other websites that Facebook has deals with. That rando website checks to say “hey computer you got a Facebook cookie?” and your computer says “yup I got a Facebook cookie, it’s cookie number bla-bla” and that website says “cool beans thanks I’ll make a note of it”. Because Facebook *made* the cookie, it knows that bla-bla belongs to you. And because there are millions of sites that Facebook agrees to check for cookies with, sites that Facebook does not own or operate, Facebook can know that you went on Target, for example, and shopped for argyle socks.).

Facebook also straight up purchases data. “Hey argyle sock company, let me know the typical demographic by zip code of people who buy your socks!” When the argyle sock company comes back and says “ok so like in 98074 the typical argyle sock purchaser is female (we infer this because they bought women’s argyle socks) and over 30 (we infer this because she didn’t use PayPal or Apple Pay she used like an old school credit card)”, Facebook can marry that up with marketing data that says the average 98074 female over 30 is also married with an income bracket of XYZ and likely owns and doesn’t rent.

Facebook can then take all of *that* data and run it through *another* set of checking stations and say ok so if she likes argyle socks then with this other data we have about her *what else* can we market to her? Maybe there’s a high correlation of female argyle sock wearing disposable income homeowner to coffee consumption. Let’s try that. Oh, did she click it? Our checking stations were *right*, let’s use them more. Oh, did she not? More data for the checking stations.

This is just one (very tortured) example: nearly every site you interact with (not just Facebook or its properties), every company that you purchase goods or services from (e.g., banks, insurance companies, etc.), and most especially every company you work with that gives you something “for free” (e.g., Instagram, Snapchat, Pinterest, etc.) collects this information, and has its own special list of algorithms it chugs through to spit out ideas as to what you like or don’t like, what you do or do not want. Sometimes they sell these ideas, sometimes they purchase others’ ideas and marry them up with *their* ideas to get super-specific ideas about you. The more inputs they can get, the more outputs they can test, and the more testing they do, the more accurate they can get. This isn’t just about argyle socks either: they can suggest or infer political preference, disposable income, sexual preference, charitable leanings, religious leanings, and so forth. They can then market to you based on what they think you want to hear, or want to read, or want to buy.

All just an algorithm.

Conservative

(NB: this isn’t actually political, although there are some strong parallels in parts.)

I work in the “tech industry”, for a large company, in the Seattle area. I’ve been an “engineer” (the profession includes program management, software/hardware development, and analytics, among other things) for a little over sixteen years. And I am conservative.

Believe me when I tell you I am trying really hard to see both sides.  I’ve taken steps to educate myself, even though I don’t identify with some of these new ideas. I can “speak” the language, but I don’t want to have to.  I just don’t want things to change; I don’t think that makes me a bad person but I’m watching things progress and I have to somehow keep up with it all. Why can’t I just keep querying my databases using SQL?

(Was that introduction meant to be double-entendre? Yes.  Did it get you all shocked and thinking that perhaps neither side of an equation is necessarily the extreme? Hopefully.  Does that mean there are no extremes in the world? Of course not; the phrase “extreme”, by its very definition, means it is atypical. Finally, yes, learning and growing as human beings is a hard but necessary thing.)

I went back to school in the early millennium to learn about computer programming and database theory, having resisted “going into computers” as the family trend. I fell in love with database theory (and practice) and at that time the name of the game was SQL (Structured Query Language). (For understanding, the name of that particular game *had been SQL* for at least a double decade already).  I am, very much, a creature of structure and definition and I found SQL intuitive and easy.  I spent the next decade working with it, refining my skill, getting it to do some perhaps unnatural things, and generally enjoying that this was a language I could speak.

Fast forward another ten years or so and I’m in a different role (and have been in different roles; my LinkedIn job history looks like it was plotted by an inebriated kitten) and the call for writing queries is much diminished. In the intervening ten years another popular query language has arisen, courtesy of my very own company, and it’s driving me nuts.  KQL, the Kusto Query Language, is the “no SQL” language used to query Azure Data Explorer clusters, and it’s *just enough* like SQL to trick you with its wily ways and not at all enough like SQL to behave in polite society.

For the uninitiated: SQL follows a prescribed list of things you MUST do (you must SELECT something, otherwise nothing happens, you must state where that something is FROM, you must identify what the something(s) are). There are things you MAY do, but you can only do them in some places: you SELECT first, you then indicate WHAT you selected, then WHERE it is from, you may then GROUP, indicate if it is HAVING a condition, and/or ORDER your results.  Barring some fancy stuff you can do like linking up Data Source A to Data Source B in certain ways, and some rules about what you can do with the stuff you’ve selected, that’s it.  Simple, refined, elegant, transactional, neat, orderly. If SQL were a desk there would be nothing on it, and all of the papers are filed in neatly-labeled folders, and all of the pencils are in the correct drawer and sharpened and facing the same way.
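For the curious, here is a small sketch of that prescribed order, run from Python against an in-memory SQLite database so it is self-contained. The `service_visits` table and its rows are made up purely for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table, just for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE service_visits (customer TEXT, miles INTEGER)")
conn.executemany(
    "INSERT INTO service_visits VALUES (?, ?)",
    [
        ("Bobbie", 5000), ("Bobbie", 10000), ("Bobbie", 15000),
        ("Alex", 5000), ("Alex", 10000),
        ("Sam", 2500), ("Sam", 5000),
    ],
)

# Each clause appears in its one allowed position.
query = """
SELECT customer, COUNT(*) AS visits   -- what you want to see
FROM service_visits                   -- where it is from
WHERE miles >= 5000                   -- optional: filter the rows
GROUP BY customer                     -- optional: group them
HAVING COUNT(*) >= 2                  -- optional: a condition on the groups
ORDER BY visits DESC                  -- optional: order the results
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

Try shuffling those clauses around and the database will refuse to play, which is exactly the neatness I'm talking about.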

KQL is your stoned college roommate’s nightstand which serves as a desk (nightstand, dinner table, etc.), piled with papers in any which way, but *somehow* they are able to retrieve *exactly* the term paper they need to turn in, right now, because they “just know”. There is *barely* any structure – you start by simply naming the first data source you’re pulling from (no Select, no indicator that that’s what you’re doing, you just say your TableName). Then each subsequent thing you want to do is marked by a pipe |; which I guess is fine.  From there on out, though, the rules are pretty wishy-washy: do you want to filter out things first? Sure, put a “where” to start.  Oh, do you want to now go pick what you want to see? OK you don’t “select” them, you “project” them, unless you’re creating calculations in which case you “extend” them, unless you intend to group them in which case you “summarize” them.  And you can do those in any order, multiple times, throughout your whole query.  I mean, you can literally have a query that STARTS with a where statement and ENDS with a project, with five other where statements and a whole splattering of calculations in between. Where is the elegance? Where is the neatness? Where is the order and preservation, I ask you?

Proponents of KQL will be quick to point out that this flexibility offers you the ability to pre-filter a ginormous data set in advance of the things you want to select and calculate, meaning the machine has to do less work (it only has to calculate the things you want it to and not necessarily all the things in the data sets you’re extracting from). Hogwash! In my day we calculated all the things or we created subqueries and it worked just fine! Besides, if your query is so cumbersome you’re probably not using indexes properly and should optimize your queries.  Why should *we* have to be punished into using some newfangled query language because *you* want cheap data?

I could, as I believe you understand now, rant and rave about this for hours. One of my very favorite work friends had to listen to me mention how much I do not like this language repeatedly, to the point that it’s a snicker from him when I say it in meetings (I still feel like sending it to him in instant messages on occasion). I won’t give it more space here, because I’ve said it and it’s out there.

I recently (well, about four months ago) took about six hours and studied KQL.  Armed with a “conversion” doc and five or six real, pragmatic queries I needed to write, I drilled through until I got the hang of it.  I can query in KQL; it is the preferred language for the majority of datasets I care about these days. I can speak this language enough to make my way about the country and transact business; it is not my native language and I still do not “think” in it. The point though is things do move on, and as uncomfortable as it is, I needed to learn this new thing. I don’t have to like it, but I do need to be able to understand it. And I can.

Even if it drives me nuts.

(Post-publish edit: This was originally written on 14 December 2020.  On 6 January a whole passel of people went to the Capitol and what started as some form of protest turned into something much, much worse, and people died. I’m leaving the post as-is, because I do not believe all conservative-leaning political folks are spoken for by those who were at the Capitol that day. Note that politically, if it’s important to you to understand my motivation or writing, I lean left.)

That’s How it’s Done

I use Flipboard to consolidate inbound tech and economics news; along with a few podcasts and my weekly Economist, that makes up the bulk of my news media intake.  This time of year it’s a particular minefield, of course, with politics. But for the most part it’s my regular vegetables of tech and economics that get me what I want to know.

I was reading an article about how Amazon is launching an Alexa service for property management — e.g., the property manager pays for/owns the Alexa that lives in the residence with the renters, using it as a de-facto localized presence to control smart home things and, essentially, as an “added service/feature” of renting the place. (So much as you’d look to see if there was that extra half-bathroom or if there was a walk-in closet, you’d see if they included Alexa, too).

For the record, I read articles, because a pet peeve is when you get the poster who forwards an article that they clearly haven’t read (e.g., using the article to make a point that the article actually counterpoints). This is a case of me reading two separate articles, coming to a conclusion, and that conclusion was wrong.  It’s a better case of a colleague gently educating me.

Firstly, to the other article.  Granted, this NYT article is about a year old but we all remember the news that made the rounds about how Alexa is always listening. It’s true, she is: she *has* to.  Obviously she can’t start your timer or add your biodegradable pet waste bags to your Amazon cart if she can’t hear you.  The NYT article is about what she does, and where that data goes, once she hears you. There is a sentence from that article, however, that did not stick in my brain from last year, so when I read the TechCrunch article, I made a comment on Twitter/LinkedIn.

My comment, quoted, is here:

“Two things: 1. interesting way to make IoT accessible to a broader base and 2. I would not at all be reassured the data is truly deleted (and isn’t, say, shipped off in snippets for “logs”/“troubleshooting”, for example). Also, the hand waving over who’s data it is needs to stop. Alexa has to listen to everything in the first place to trigger on her name.”

For the record, I still think #1 is true, and most of #2 is still an open question for me. I’m not at all clear on what happens to the data (yes, deleted at the end of the day, but… is it? What part of it is deleted? Is it every command, every call; or is there a record still in the smart thermostat (or a downstream reporting service) of all the changes I made, for example? And so forth.) Or who owns it (e.g., if something happens in the home, and the home belongs to the property manager, and the Alexa belongs to the property manager, but I’m the one renting the home, is that day’s data mine or the property manager’s?)  However, this post is to talk about someone who reached out to address the last point:  “Alexa has to listen to everything in the first place to trigger on her name.”

Now, it’s true that she does have to listen. However, a generous colleague reached out — privately, via LinkedIn messenger — to reassure me that Alexa does listen in for her name, but that listening happens only on the device… she doesn’t “trigger” until she hears her name, so no data leaves her until she does.  Or put the way they put it (bold is mine):

“Wake word detection is done on device in a closed loop, that is no audio sent to Alexa (aka. the cloud). Only when the on-device model detects the wake word with a high confidence, the audio of the wake-word it sent to the cloud for additional verification (besides false-positives this handles for example “Alexa” being said in ads).  No audio is ever sent to Alexa without a visual cue (the blue light).”

(Incidentally, the NYT article has this in a sentence that didn’t stick in my brain at all (bold is mine):

“…it’s true that the device can hear everything you say within range of its far-field microphones, it is listening for its wake word before it actually starts recording anything (“Alexa” is the default, but you can change it to “Echo,” “Amazon,” or “computer”). Once it hears that, everything in the following few seconds is perceived to be a command or a request, and it’s sent up to Amazon’s cloud computers…”)

I wanted to share my colleague’s message because *this is exactly how it is done, folks*.  While I would’ve been just fine with them pointing this out as a comment to my LinkedIn post, they’re being polite and careful, because not everyone would be and frankly, they and I had one lunch at one time and that’s about all we know of each other.

My larger point (because I know that not everyone is into public correction and many could find it disconcerting) is that we need to be better at private correction, at accepting new data, and at assimilating it or at least making the sincere attempt.  You will read articles and they will be carefully constructed on the part of the author, either attempting to be scrupulously fair or attempting to sway you one way or another, but what you don’t get to see is what was omitted, either via editorial judgment or a required word count or assumed common knowledge.  What you don’t get to realize is what your brain has omitted, either via convenience, or simply the wear of time.

So thank you. I happily sit corrected :).

Stolen Identity and Next Steps

Well, it’s finally happened. Some enterprising twat has used my identity to do something naughty and it’s causing no small amount of consternation.

Like many in Washington, my information was used to file a false unemployment claim.  Some pseudo-human got hold of my social security number and my email, went to the ESD, and said they were me and that I was unemployed and “I can haz money now?”  I heard about this from my employer, who wanted to know if I really had filed for unemployment, while still employed.

  • Of course I couldn’t concentrate on anything after reading that email.
  • Of course I went and put a credit freeze with all three bureaus.
  • Of course I changed all my passwords.
  • Of course I filed this as a fraudulent claim with the ESD.

There’s a couple more things I didn’t realize I should do (that I have since done):

  • I have filed a police report (this can be done online!).
  • I’ve documented it with the FTC.

Going through all of this is a hassle of course, and on top of other things right now it’s quite unwelcome. Here’s the thing: I have resources, and time, and a really great employer who identified it and let me know it was happening, along with specific guidance on what to do next.  Given the size of this fraud (there’s thousands of fraudulent claims for state of WA right now) there are literally thousands of people dealing with this, and not all have time to deal with it or guidance to deal with it. So, if you or someone you know has discovered some sort of identity fraud, here’s some links and things to do:

  1. Put a credit freeze (free to do, and can be done online) on your credit with Equifax (yes, that Equifax), Experian, and TransUnion.
  2. File a fraudulent claim with the entity that was defrauded (in my case, it was the Washington state employment office, and it was filed online).
  3. File a police report (also online, non-emergency).
  4. Document it (online!) with the FTC.
  5. Call (or email, or go online) your banks and let them know, so they can guard on their end.
  6. Change all your passwords and/or your password algorithm.

Will this make you bulletproof to future fraud? No — shit can still happen. (Murphy’s Law is a law for a reason). No sense in making it easier for the assholes that do this.

Hustle: How to Get Things Done

In Empire Records, Liv Tyler’s character is this seemingly perfect human who is a straight A student, cool, works in a record store, and gets a lot of things done. When her friend comes to pick her up for their shift at the store she’s got fresh-baked cupcakes and her friend marvels at her productivity: her answer is that there are 24 useable hours in a day. (Sure, later on we find out she’s been on amphetamines but we all know someone like this who isn’t. Or probably isn’t.)

Increased productivity is an economic expectation (and/or desire) for a given population but it’s also an expectation we put on ourselves, and our kids, coworkers, volunteers, etc. The “always busy” culture celebrates the hyper-productive person who, when you ask them how their day was, will inevitably reply “busy”.

In my career (which sounds really great as a tag for a series of only vaguely tethered job choices) I have developed a set of practices to live in that world and get a lot of things done. While it’s true that there’s no such thing as multitasking you can learn to recover from switched contexts faster, when to shove the ball into someone else’s court, and how to pursue the answers you need (to unblock your course of action) doggedly.

Getting Someone to Respond

Most offices work in an email-enriched environment (maybe too enriched) for primary communication.  Some have Slack or Teams as an augment or replacement. Then there’s meetings and conference calls.  Within these, there’s usually either the need to disseminate information or the need to acquire information. Getting someone to respond is the need to acquire information: either to get them to acknowledge a given topic or to provide a missing piece of data so you can go about your day. Example: I need to know if there already exists a security protocol/practice on a system I’m thinking about using. I’ve read the provided documentation* and still don’t have an answer.  At this point I reach out to the name responsible for the documentation (or the name responsible for the product, or indeed anyone I can find related to it) and send an email or Slack@. When the inevitable non-response occurs (email is good for that), I set a meeting.

Why?

Because people hate meetings. It’s a massive disruption, they’re stuck on the phone or in a conference room when they could be doing something else, and it means they’ll have to (gasp) talk to you in real time.  The reason texting has taken off and voicemail is dead is that, for the most part, people don’t actually want to interact with you unless they have some social basis for it.  Creating a meeting and pushing the point gives them one of three options:

  1. To unblock you by responding to the meeting request/your original email and giving you the data you need or some other poor sap to go after.
  2. To actually meet with you, in which case you get not only the answers you’re after but you can pelt them with more questions.
  3. To ignore your meeting request.

For that last: it does happen, but rarely.  When it does, and *if you’re truly blocked*, you request a meeting with their lead.  At some point up the chain, meeting requests and emails can’t afford to be ignored.  This is a somewhat nuclear option, so use sparingly.  You can also branch out and forward the meeting/email to others in the same group/product.

Carving out Time

This may seem silly, but actually carving out time on your calendar (“booking yourself”, as it were) will make sure you have the unblocked time you need to get whatever-it-is done, and that you don’t accidentally overlap incompatible things.  I can clear out my email while dinner is in the oven, and I can go for a run on the treadmill while listening to a podcast, but I can’t clear out email while listening to a podcast (because the brain gets confused). Some folks use this to actually make sure they remember to eat (e.g., “lunch” as a 30-minute block) and some folks do this so they can catch up on training or get focus time to diagram something out. Bottom line: book your time, because if you don’t someone else will.

Also, this includes personal stuff: I have calendar time carved out for housecleaning, for laundry, for grocery shopping, for trimming the kitten’s nails, for blood donation, etc. It keeps me straight. Sure, I could try to keep it all in my head, and I used to try to do that.  In 10th grade I double booked a sleepover at a friend’s house (super-rare for me to get to do those back then) and a babysitting job.  I was devastated because I had to do the job (you do what you say you’re going to do. Period.)  Keeping it written down reduces unpleasant double bookings.

Finally: carve out time to do nothing.

That’s right. Do nothing. Give yourself a night a week if you can afford it. Block it off so it can’t be consumed by other things (unless you really want it to).

Prioritize your Backlog

In the Hyper-productive Expectation World, you will always have more to do than can be done. Always. There’s not enough caffeine, amphetamines, or hours to accommodate everything.  You can either ruthlessly trim things (which is very effective but requires a strong will to say “No” sometimes) or you can prioritize things (which means you still have them on your list, they’re just much farther down).  Look at the Volume of Stuff, and rank it from most important to least.  Some things will be of related importance (you can’t do A until you do B, but A is really important, so get B done now) and some will be compatible or a two-birds-one-stone situation (I can walk at an incline on the treadmill and read that latest set of whitepapers). I recommend having prioritized lists for Work and Non-Work (and if you have other commitments, such as PTA, Scouts, Church, Nonprofit, Clubs, etc., prioritize within those).

Use Technology To Help You

Use your calendar and reminders. Use a list/task tracking app. Use OneNote. Use the alarm on your phone. Use sticky notes. Use whatever works for you to remind you if/when you need to do stuff and what it is. For example, we have a running OneNote grocery list broken out by the stores we use (because Trader Joe’s doesn’t have all the things and Costco doesn’t either). We update it through the week. I have an Outlook task-tracking list of the things that are most important for a given week. My friends use a Trello board to organize household responsibilities and projects. Another friend uses their inbox to prioritize.

The thing to determine here is which set of technologies works *for you*, because some folks like to leverage their mobile for keeping their brains straight and some people prefer tactile things like sticky notes and highlighters. There’s no one *right* way, just the way that works for you. You may have to try a few things before you hit on the right combination.

Eat Your Frogs First

In any prioritized list of things to do, there’s the thing you don’t really want to do but have to do. Maybe it’s the cat-pan change out. Maybe it’s reorganizing under the bathroom sink. Maybe it’s collecting all of the papers for your tax return. Maybe it’s going line by line through an Excel spreadsheet until you find that the issue with line 943 is that the value that should be a decimal was in fact text, and it broke your import. You know, that thing.

Do that thing first if faced with it and another 3 things of the same priority. You’ll get it out of the way, the other things will feel (and be) easier, and you’ll feel all kinds of virtuous.

Wash your hands when you’re done, though.
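For the spreadsheet flavor of frog (the text-where-a-decimal-should-be kind), you don’t have to go line by line yourself. Here is a small Python sketch, assuming the data has been exported to CSV; the `find_bad_decimals` function, the column names, and the sample rows are invented for illustration.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

def find_bad_decimals(csv_text, column):
    """Return (line_number, raw_value) for every row whose column is not a decimal."""
    bad = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        try:
            Decimal(row[column].strip())
        except InvalidOperation:
            bad.append((line_number, row[column]))
    return bad

# Made-up sample: the second data row has text where a price should be.
sample = "sku,price\nA1,19.99\nA2,twelve\nA3,4.50\n"
print(find_bad_decimals(sample, "price"))
```

The line numbers only match what you see in the file if no cell contains a line break, so treat them as a pointer to where to start looking.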

 

This is Going to Hurt You More than Me

Greetings from the ending of a self-imposed blogging silence: I got the aforementioned email and am happy to state that I will shortly be joining Microsoft.  Sur La Table was very diverting and offered many challenges with respect to data, but it’s hard to pass up an opportunity to work in, and with, big data.

As a result of that interview loop, plus some interviews I did for an open position we have at Sur La Table, I’m here to write something Very Important: Don’t Lie on Your Resume.

Typically when I am called in to conduct a technical interview, I read the candidate’s resume, and then ask the hiring manager how technical they want me to get. If it’s me, and I’m hiring for a developer, I’m going to get very technical, and you’re going to spend 100% of your time with me at the whiteboard. If it’s for someone else, and I’m hiring for say, a PM, or a QC, or technically-minded-but-not-otherwise-a-developer role, I’m still going to test you on skills you state in your resume.

So when you tell me that you have a lot of experience with SQL, or that you’ve been using SQL for five or six years, I’m going to run you through the basics. Either of those statements will tell me that you know the four major joins, you know the simplest way to avoid a Cartesian product, you know how to create data filtration in a join or in a where statement, and you know how to subquery. I’m not even getting to more advanced parts like transactions with rollbacks, while loops, or indexing; the aforementioned items are what I would characterize as basic, everyday SQL use.
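To make those basics concrete, here is a small sketch using Python’s built-in sqlite3 module. The `customers` and `orders` tables and their rows are invented for illustration, and it only shows INNER and LEFT joins (plus the Cartesian product you get with no join condition), since older SQLite versions do not support RIGHT or FULL OUTER joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bo'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 75.0), (12, 2, 40.0);
""")

def rows(sql):
    """Run a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# No join condition: every customer pairs with every order (Cartesian product).
cartesian = rows("SELECT c.name, o.id FROM customers c CROSS JOIN orders o")

# The ON clause is what prevents that; INNER keeps only matching rows.
inner = rows("""SELECT c.name, o.total FROM customers c
                INNER JOIN orders o ON o.customer_id = c.id ORDER BY o.id""")

# LEFT keeps every customer; filtering in ON leaves unmatched customers with NULL.
filter_in_join = rows("""SELECT c.name, o.total FROM customers c
                         LEFT JOIN orders o ON o.customer_id = c.id AND o.total > 50
                         ORDER BY c.id""")

# The same filter in WHERE runs after the join and drops those NULL rows.
filter_in_where = rows("""SELECT c.name, o.total FROM customers c
                          LEFT JOIN orders o ON o.customer_id = c.id
                          WHERE o.total > 50 ORDER BY c.id""")

# Subquery: customers with at least one order above the average order total.
subquery = rows("""SELECT name FROM customers WHERE id IN
                   (SELECT customer_id FROM orders
                    WHERE total > (SELECT AVG(total) FROM orders))""")

print(len(cartesian), inner, filter_in_join, filter_in_where, subquery, sep="\n")
```

The pair worth staring at is `filter_in_join` versus `filter_in_where`: same filter, different placement, different row count.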

Imagine my dismay, then, as an interviewer, when after declaring (either verbally or on your resume) that you are a SQL expert, you can’t name the joins. Or describe them. Or (worse) describe them incorrectly. When you say you know SQL, and then prove that you don’t, it makes me wonder what else on your resume you claim to “know” but don’t, where that is harder to disprove in the interview. The default assumption, for the protection of the company, is that your entire resume is a raft of lies. It’s the surest way to earn a “no hire”.

It would have been far better to state the truth: someone else wrote SQL scripts for you, told you what they did, and you were adept enough to figure out when there was a disparity in the output. That does not mean you “know” SQL, it means you know how to run a SQL script. This gives the interviewer an honest window and the ability to tailor your time together (remember, they’re getting paid by the company to spend time with you, if it’s productive it is not a waste of money) to figure out your strengths and weaknesses. Having just been hired into a position that works with big data, where I was honest that the largest db I have worked in and with was about 3TB, I can attest that it’s really hard to have to look a hiring manager smack in the eye and say: “I have 90% of what you have asked for but I’m missing that last 10%”. It gives them the opportunity, however, to decide if they’re going to take the chance that you can learn.

If they’ve already learned you’re honest, then that chance-taking looks better in comparison.

In Development

I was at a holiday gathering the other day and during the usual course of “…And what do you do?” I replied that I was a developer. The inference was that I was a Real Estate Developer; I had to explain that I was a Make the Computer Do Useful Things Developer. I was talking to two ladies about my age (Hi, I’m 40), and was surprised at the reply: “Oh, that’s unusual!”

I suppose I should not have been. I know a lot of women in IT, but darned few who do development.  To be clear: most of the women I know in the Information Technology space were at one point developers, or have a passing knowledge of some development language. They merged into Project or Product Management, or Business Analyst roles. These roles require knowing what is possible of code without actually having to write any of it, and so if you get tired of the incessant progress of development technology then that is one way up and out (and it is a way I took, about five years ago).

Careers arc and opportunities knock and itches flare up and I am once again a developer.  And I find myself, when talking to people who don’t work with or know other developers, battling not only the usual misconceptions about development, but the gender-based ones as well.

Development (in IT terms) is the handle one applies to the concept of using a series of commands (code) to tell the box (tower, laptop, server, etc.) what you want it to do; if you want it to take in something or not, if you want it to spit out something or not. In order to create this blog post many people did varying forms of development (from creating the templates that instruct the browser how to make this post look all shiny, to the protocols that tell the server where to put this post, to the widgets on the front end that tell you things like I haven’t posted in a while). If I typed it in MS Word, that required a bunch of other development by a bunch of other people.

Development is not:

  1. Something you can do on five screens drinking 3 bottles of wine to create a “worm” that appears as a graphic on your screen (as in Swordfish), and usually doesn’t involve a developer logging an Easter Egg of themselves in a bad Elvis costume with sound effects (as in Jurassic Park)*. If I drank 3 bottles of wine and was looking at 5 screens they’d probably be the ones you see in a hospital room, and the only graphics I would see appearing would be the “worm” that is my heart rate monitor flat-line.  And while I have myself buried Easter Eggs and commentary in code, it isn’t that elaborate because you don’t typically have time to build elaborate things. You’re busy rewriting all of the stuff you just wrote because someone decided to change the scope of your work.
  2. Anything involving a graphical user interface (GUI). When a developer talks about manipulating objects, those objects are typed-out phrases, not boxes that are dragged and dropped. There are some development environments that offer up a GUI in tandem with the “scripting” (that bit about writing out words I was talking about), but more often than not they are there to illustrate what you have scripted, not to assist in your scripting.
  3. Finite. Development technology is constantly changing and no one developer knows all of the development methods or languages. That would be like someone knowing all of the spoken languages in the world. Rather, it’s typical you’ll find one developer who “speaks” one development language really well, or maybe a branch of languages (much like you run into a person who can speak Spanish and French and Italian, because they are rooted in the same “base” of Latin, it’s not uncommon to find someone who can code in ASP.Net and VB.Net and C#.Net, because they’re all of the Microsoftian .Net base).  No one hires “a developer”, they hire a .Net Developer or a Java Developer or a Ruby Developer or what have you. Specialization exists because the base is so broad.

Modern cinema has done an injustice to developers in terms of making what we do seem both simple and sexy; the “shiny” environments typified by the interfaces “hackers” use on-screen look really slick and probably took some real developer hours of time to make look good… with absolutely no real purpose. That said, actual development can be simple (with clear requirements and a decent knowledge of the things you can and can’t do) and can be quite sexy (if you’re sapiosexual). It’s just not well-translated in current media. (To wit: Jeff Goldblum uploaded a virus to an alien system from a PowerBook. He didn’t have to know the alien system’s base language, machinery, indexes, program constraints, functions, etc. And it was a Mac, in the ’90s, when development was not one of its strengths).

Most of what development is, is trying to solve a problem (or two), and generating endless logic loops and frustrations along the way. You build a “thing”, you think it works, you go to compile it or make it run, it fails, you go dig through what you wrote, find you’re missing a “;” or a “,” or an “END” or a “GO” or a “}”, re-run, find it fails, and go dig through some more. For every hour you spend writing out what you want it to do, you spend about an hour figuring out why it won’t do it.  This process of “expected failure” is not sexy or shiny or ideal, and that’s why it doesn’t show up on-screen.

These are misconceptions every developer, regardless of gender, has had to deal with at some point. Some deign to explain, some gloss over, some simply ignore; much like I really hope we get a socially-functioning, intelligent person on-screen soon, so do I hope that we get a showcase for the simple elegance of real development.

It would be great, too, if there were more female developers on “display” as well (and not for their bodies, hence the scare quotes). Think through every movie you’ve ever seen that shows people doing any real development, “hacking” even (a term that is abused beyond recognition); how many were female? Go back to the movie “Hackers”: did Angelina Jolie actually, ever, really type anything? You inferred that she did, but the real development, the real “hacking”, was done by the crew-of-guys. Oh, and that’s right, she was the only girl. The Matrix? Carrie-Anne Moss spent precious little time in front of a computer there. She did look damn good in skin-tight leather.

Fast-forward a decade (or two) and we’re pretty much in the same boat. You see women behind computers on-screen, but they are typing in word processing programs or moving the mouse to click it on the shiny picture of the Murderer/Prospective Boyfriend (or, you know, both). They aren’t buried under a desk trying to trace a network cable or eyeballing multicolored text trying to figure out *WHY* it won’t compile, they’re delivering the shiny printout to the Chief/Doctor/Editor from which Decisions Will Be Made.

We find it surprising in social circles, I suppose, for women to be in development, because we don’t see it exemplified or displayed in any of our media. In TV, movies, and even proto-development toys for children, it’s eager-looking boys who are featured; the girls are reserved for the beading kits and temporary tattoo sets (actually, there’s precious little out there for getting your child, regardless of gender, to learn code, but that is changing). We have crime-solving anthropologists, we have NCIS ass-kickers, we have cops and coroners; maybe it’s time we had a developer.

*Jurassic Park is a good example of both great and poor development display. Right before tripping that “Dennis Nedry Elvis Graphic”, Samuel L. Jackson’s character is eyeballing Nedry’s code. That stuff that looks like sentences that don’t make sense? That’s code. That’s what it looks like, for the most part. Unfortunately, later on when the little girl is hacking the “Unix System” that “she knows”, it’s all graphical. And that’s not accurate.

Plus One To Awareness

Yesterday 10pm local time ended my 24-hour vacation from any sort of connectivity (including the ability to “google” anything, text anyone, etc.) If you think it’s simple, try it in a place as connectivity-savvy as the Magic Kingdom. There’s an app to navigate the kingdom that includes line times, parade routes and hidden Mickeys. I couldn’t download or use that, no phone. There’s free wi-fi in the hotel and in the parks. Nope. In a line for Space Mountain where every 3rd person is lit from beneath (thanks to their iPhones and in a couple of cases, iPads), connectivity sure would provide an answer to the waiting game.

When I turned my phone off I made an analog list (pen, paper) of all the things I’d use connectivity for if I had the ability to, and the time.

  • At 11pm that night, finding it difficult to fall asleep and devoid of reading material (I had finished it), I really wanted to read my twitter feed to fall asleep, but I didn’t.
  • At 3am I wanted to look up the symptoms of food poisoning (yes, it was), but I didn’t.
  • At 9am the male child asked if he could bring his DS into the park to keep him occupied, and when I incredulously turned to him to explain the whole park was designed to keep him occupied, and discovered that he was teasing me, I really wanted to tweet it. But I didn’t.

And on it went. In the line for Space Mountain I wanted to share the statistical correlation between a person with an iPhone and a lag in line continuity, I wanted to look up the name/number of the restaurant we are to eat at tonight, I wanted to check the terms of the Disney Visa and see if it really was the good deal it was purported to be.

But the thing that really got me was pictures. I couldn’t take pictures.

Pictures of the male child when he finally got his sword (it’s impressive), of the lush greenery that would exist just fine here without the careful maintenance it gets, but would die in two weeks outside in Washington, of the attention to detail this park gives to its art and architecture. “The floors here are *really clean*,” the male person said, as we trotted along in line at Space Mountain. (This was fortunate for the teenager in front of us who, when the line stopped, would sit down on them. Just plopped right down. Even if the line moved again, and then she’d try to scoot along on her ass. Ridiculous, naturally.) It became a challenge to find something out-of-place anywhere.

Therefore, today, fully connected, app-in-hand, there will be pictures, and tweeting, and tweeting of pictures, and Foursquare check-ins, and more pictures.

PS  – for those wondering, my personal email for a 24-hour period counted 74 including advertisements, and 2 for legitimate communications. My work email counted 14, of which 8 were things that were not about me and completely resolved before I got online, 2 were social (one going away notice, one lunch notice), a meeting change notification, and 3 legitimate to the project I was working on.

PPS: Grog the Luddite would like to mention he’s really a sensitive, un-macho, really-into-stopping-and-smelling-the-roses guy who likes technology just fine and even knows a thing or two about it; he just wanted me to realize that there was life outside of it. Point taken.

Dabble, Dabble, Toil and Babble

“Your biggest problem”, he stated flatly, “is you’re a dabbler. You don’t specialize in anything. You are not going to succeed because you do not focus on a given talent; you just dabble in this and that.”

This was actually stated, to me, in a 1:1 with my boss at the time. He was a financial services guru and I was his personal and executive assistant, so assigned because I was technically inclined and could type fast. In short, I was good enough to be his e&pa because I dabbled.

Despite initial reaction, this was meant to be a positive speech: it was going to Incite Me To Action and I was going to Make Something Of Myself. Instead, I quit the job, moved back home, and dabbled some more.

I dabbled my way into SQL.

Then I dabbled my way into ASP.Net. Then I dabbled into VB.Net.

Then I dabbled into SQL some more, and into project management. And the dabbling continued, through business development, communications, operations, and back into development (but C# this time).

“Which one of your degrees does this job come from?” wondered my stepmom one night in Spring when I told them I had acquired this one. “None of them!” my dad said wryly.

My old boss is correct: I am a dabbler. None of the things I have done, have I truly specialized in. There are better people at SQL out there than I am, there are certainly better people at .Net and BusDev. But there are damned few who can speak those languages and are willing to translate them, painfully, carefully into shiny PowerPoints and ROI-laden SWOT analyses.

A few months back I had my midlife crisis. It lasted 36 hours and was in the vein of “what am I DOING with my life? Where will I go next?” And I realized that every other time in my life I’d been faced with that question, things unquestionably got better, more exciting, and more rewarding.

I have friends who went to college for what they ended up being in life, they seem happy and fulfilled. I have friends who picked a field and stuck with it, and will have a decent retirement to speak for it. My own parents offer four different examples of picking a road and trotting down it come hell or high water and they’ve all done fine.

I do not believe, though, that this diminishes any success reached by a diagonal route.

Owning Your Data

I realize I’m terribly late to this party. I’m not even fashionably late, I’m “you arrived just as the caterers were cleaning up and the hostess had taken off her shoes” late. I’ve been busy (as, I think, I’ve amply covered).

However, I really must say a word or two about Reinhart and Rogoff.

For those who don’t follow economics or kinda remember they heard about it but aren’t sure what the big hullabaloo is, I recommend you google it; look for the Economist, the Guardian, and the Atlantic non-editorial resources to start. There’s a few. Then you can go off to the editorials for dessert. For those who don’t want to google, here’s the Twitter version: Two economists present a work in which they suggest that there is a deep drop off in economic performance without austerity measures. Essentially they said that when debt is high, growth slows to a grinding halt; the graph they presented roughly resembled the cliffs of Dover.

And it was wrong.

Because of an Excel spreadsheet formula error.

Normally this wouldn’t be awful. Anyone, and I do mean anyone, who has used Excel to convey data (or volumes of analysis) has made that spreadsheet error, and it can be as simple as not properly conveying a SUM formula, or as complex as messing up your VLOOKUP in your nested IF statement. Excel has been bastardized over the years into an analytics tool (by default, in that it’s on nearly every machine), a role it really can’t fully accommodate without failsafes; EVERYONE makes an Excel error.
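To show how little it takes, here is a Python sketch of one common flavor of that mistake: a formula range that stops a couple of rows short of the data. The country names and growth figures are invented for illustration (they are not the study’s data), and `checked_average` is a hypothetical helper standing in for the kind of failsafe a spreadsheet doesn’t give you for free.

```python
# Hypothetical growth figures (percent) for illustration only; not the study's data.
growth_by_country = [
    ("Country A", -1.5),
    ("Country B", 0.2),
    ("Country C", 0.9),
    ("Country D", 2.4),
    ("Country E", 3.0),
]

def average(values):
    """Plain mean, the equivalent of a spreadsheet AVERAGE over a range."""
    return sum(values) / len(values)

def checked_average(values, expected_rows):
    """Mean that refuses to run if the range does not cover every row."""
    if len(values) != expected_rows:
        raise ValueError(f"averaged {len(values)} rows, expected {expected_rows}")
    return average(values)

values = [growth for _, growth in growth_by_country]

# Like dragging a range a couple of rows short: the last two rows are left out.
truncated = average(values[:3])
complete = checked_average(values, expected_rows=len(growth_by_country))

print(f"range cut short: {truncated:.2f}")
print(f"full range:      {complete:.2f}")
```

With these made-up numbers the short range averages out negative and the full range averages out positive, which is the whole problem in miniature: same data, same formula, different rows, different story.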

Reinhart and Rogoff’s mistake is NOT that they made a spreadsheet formula error. And, contrary to the article above I linked to, it’s only partially that they did not peer review.

It was governments’ (plural, many, varied) mistake to use it to shape policy.

Lookit, suppose I told you that, according to my Excel spreadsheet, you were very likely to die from dehydration if you didn’t eradicate all but 0.4 grams of salt per day from your diet. For perspective, the average diet has about 5 times that. You would very rightly look to other studies, other data, other sources of information. You’d poll your neighbors. You’d check with friends. You’d do your due diligence before you used my say-so, no matter how shiny my Excel spreadsheet, or even how shiny my MD would be (this is fiction, after all).  Plenty of people are told by their doctor to lose 10lbs because it will make a difference in the long run, and plenty of people seem to blithely ignore it because they don’t have corresponding (personal, attributable, anecdotal) data.

So why, why, why did any government, financial body, fiscal institution leap on the screeching panic train when R&R’s study hit?  Why did no one look to a 2nd opinion, a different study; why didn’t they check the data for themselves before subjecting their economies to the fiscal equivalent of a rectal exam?

I have been in data now for 15 years. It’s not a long time in the scheme of things, but it’s something I’m known to be passionate about. I can go on and on about how data works, or doesn’t; what you can derive from it; how data *is* integrity if done right. Any form of analytic reporting that is worth its salt has been tested, peer-reviewed, and validated against two or three other methods before it is used in a practical space. At Expedia, at one point, I managed 500 ad-hoc requests per month, and each of those was eyeballed against existing reporting and a decent sense-check before being used to cut deals (or not).

Now, please understand: R&R screwed up. And, apart from their formula error, they insist the outcome is the same (and it is, but it’s the equivalent of saying “ok it’s not a steep drop off anymore, more of a speedbump, but still it’s a delta!!”). This is the foible of the data monkey; again, something we’ve all been prey to. But not all of us have done it to the culpability of large (and small) governments, and most of us have learned to admit when we’re wrong. That is the crux of it: if no one is perfect, no data is perfect, to pretend yours is against evidence to the contrary is specious at best and negligent at worst.

I argue, though, that the more egregious mistake is to *follow* that data without validation. To quote Ben Kenobi: “Who’s the more foolish, the fool, or the fool who follows him?”