
What I Learned About Networking

Last night, I attended a session at CTO School, a New York City-based meetup for current or future CTOs (I fall into the latter group). The theme of last night’s set of presentations was networking, which was helpful to me, because it’s not something I’m good at. Here are some of the things I learned.

 

Not about acquiring acquaintances

Networking is not about shaking hands with every single person in the room and connecting with them on LinkedIn. Instead, it’s about forging relationships that are valuable to both people.

Provide Value

Many of the speakers discussed both why you need to provide value to others without expecting anything in return and how to do that. Most of the ways are pretty simple. Here are a few examples:

  1. Comment on someone’s blog
  2. Give feedback on their product
  3. Make introductions between people
  4. Send the person content he might be interested in
  5. Seek opportunities to help by following their blog and their Twitter feed
  6. Present talks
  7. Host/sponsor meetups

Develop a Plan

I’ve known that I need to network, but I have often been frustrated because it never seemed to have a direction. By establishing goals, figuring out who can help achieve those goals, and then figuring out how to meet those people, networking has more of a direction.

Use Tools

If you do plan, then there is a lot to keep track of: people to reach out to, when you last reached out to them, how you know them, and what things interest them. The speakers mentioned some valuable tools for managing this process:

  1. Streak Browser Plugin
  2. Followup.cc
  3. CardMunch (only for iPhone, but apparently there are similar tools for Android)
  4. Contactually
  5. and of course, spreadsheets

Importance of Soft Skills as a Leader

Given my career goals, one of the best lines of the night was from Sameer Sirdeshpande. He said that the more successful you become in your career as a technologist, the less your tech skills count for something. That doesn’t mean that they aren’t important, but it does mean that you are expected to do a lot of other things as well.

It’s about Learning

When networking, you shouldn’t be asking for a job right away. You should listen to people. You should find out about what their interests are. You should be seeking common ground. It is with these things that we forge relationships with each other, and it is only after we have a relationship that we can help each other.

Agile Writing – What Our Teachers Didn’t Tell Us

I recently came upon a post How to Write Like Your Teacher Told You Not To that discusses many of the style differences between how we are taught to write in school and how to write blog posts. There are some good points in the post, but there’s something else the post doesn’t address: the process of writing.

The academic writing process I learned in school focused on planning as a part of writing – identifying the subject matter, a thesis, and doing an outline before starting a first draft. In the real world though, we generally don’t write that way. I could use my writing experience as an example, but a more authoritative example is that of David McCullough (from an interview with Charlie Rose, 48:45 into the interview):

As soon as you start writing, you become aware of what you don’t know, what you need to know… When you write, you suddenly have ideas, or insights, or questions, that you wouldn’t have if you weren’t writing.

The point is you can’t have a plan until you’ve done a rough draft. Without the rough draft, you can’t know if the subject material is too big; how much background reading you’ll need; whether your thesis is tenable; or whether you even care about what you’re writing about in the first place.

This is another example of how the Agile approach to work better handles uncertain creative processes. Plans made at the beginning do not benefit from knowledge accumulated later in the process, and so a robust creative process requires some phase of hands-on, in-the-trenches discovery before a meaningful plan can be produced.

What You Should Use Code Metrics For

There are several tools for measuring different aspects of code quality – PMD, Emma, Clover, Checkstyle. There are other tools that aggregate one or more metrics into combined views or composite metrics, including Maven Sites, Crap4j, and Sonar. There are lots of good tutorials out there for how to set up these tools, so I won’t repeat any of those. What I haven’t found much on is what a team is supposed to use these tools for. There’s a lot that goes into it, but here’s my attempt at a short summary:

Measure Improvement, Don’t compare to some other project

It’s tempting to compare your project to another project and conclude you’re doing a good or bad job based on having a better or worse score for a particular metric. In reality, comparing two projects is dangerous because different programming languages or requirements may inevitably lead to different values for these metrics. For example, a piece of software dealing with billing or commissions is likely to be much more complicated than code for user profiles.

Instead, what you should compare your metrics to is your code metrics over time. Code metrics can reveal how impactful a refactoring was (by reduced cyclomatic complexity) or how well the latest test driven development training stuck with the team (by measuring number of tests, average lines per test, or code coverage percentage). Looking at graphs of metrics over time, especially as part of an Agile Retrospective, can be a great starting point for a discussion about whether the team’s improving.
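
As a minimal sketch of comparing a metric against its own history, the Python below takes one snapshot per sprint and reports the change from each sprint to the next. The sprint labels, the numbers, and the function name `metric_deltas` are all hypothetical; in practice the values would come from whatever tool the team already runs.

```python
def metric_deltas(snapshots):
    """Return (label, change) pairs between consecutive snapshots.

    snapshots: list of (label, value) tuples in chronological order.
    """
    deltas = []
    for (_, previous), (label, current) in zip(snapshots, snapshots[1:]):
        deltas.append((label, round(current - previous, 2)))
    return deltas


# Hypothetical average cyclomatic complexity per method, one value per sprint.
complexity_by_sprint = [
    ("sprint 1", 6.4),
    ("sprint 2", 6.9),
    ("sprint 3", 5.1),  # the sprint in which a refactoring landed
]

for label, change in metric_deltas(complexity_by_sprint):
    if change < 0:
        print(f"{label}: down {abs(change)}")
    elif change > 0:
        print(f"{label}: up {change}")
    else:
        print(f"{label}: unchanged")
```

The useful number is the change, not the absolute value: a drop right after a refactoring sprint is something the team can discuss in a retrospective without reference to any other project.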

Use Code Metrics Before Development, Not Just After

While useful in retrospectives, code metrics are also useful before adding a feature. Engineers normally talk with one another about potential issues that one might run into in a certain piece of code. A good code metrics tool also acts as an extra team member you can consult.

An example of this is when you’re working on a part of the code base you’re less familiar with. Code metrics can give you a sense of which classes and methods are going to need refactoring before you touch them. This can lead to much better estimates – knowing beforehand that cleanup work will be needed leads to better project outcomes than finding out halfway through a project.
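
To make this kind of survey concrete, here is a rough sketch that ranks the functions in a piece of Python source by an approximation of cyclomatic complexity (decision points plus one). It uses Python’s standard `ast` module rather than any of the Java tools named above, and the counting rules are a simplification of what real tools do. The sample module and the names `approximate_complexity` and `rank_functions` are hypothetical.

```python
import ast

# Node types that each add one decision point. This is a simplification of
# how real tools count cyclomatic complexity.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approximate_complexity(function_node):
    """Count decision points in one function and add 1 for the entry path."""
    complexity = 1
    for node in ast.walk(function_node):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds one extra path per additional operand.
            complexity += len(node.values) - 1
    return complexity


def rank_functions(source):
    """Return (name, complexity) pairs, most complex first."""
    tree = ast.parse(source)
    scores = [
        (node.name, approximate_complexity(node))
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical module used only to exercise the sketch.
SAMPLE = '''
def get_profile(user):
    return user.profile

def compute_commission(sale, rep):
    if sale.refunded:
        return 0
    rate = 0.05
    for tier in rep.tiers:
        if sale.amount > tier.threshold and tier.active:
            rate = tier.rate
    return sale.amount * rate
'''

for name, score in rank_functions(SAMPLE):
    print(name, score)
```

The functions at the top of the ranking are the ones to read first, and the ones to budget cleanup time for, before estimating a feature that touches them.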

Code Metrics are metrics about the Team and Project, Not Individuals

There’s a temptation in setting up code metrics projects to use metrics to measure individual performance. For example, if Alice’s code has a higher cyclomatic complexity than Bob’s, a very naive conclusion is that Bob is a better coder. However, if Bob has been coding up basic CRUD operations and Alice has been working on complicated billing logic, then the metric clearly says nothing about Alice’s or Bob’s ability as coders. Instead, the cyclomatic complexity says something about the code as a whole, and an increase says something about the project and potentially about the team, but it says nothing about Alice and Bob. This goes for any other code metric: lines of code, unit test code coverage, PMD warnings, and so on.

Why This Matters

This is probably a subject for another post, but code metrics matter so much because they can bring a whole new level of transparency and objectivity to whether or not a team is improving, and constant and deliberate improvement is the hallmark of a great team. Agile teams in particular benefit from looking at metrics at least once a sprint and using them as a data point in figuring out what the team needs to be more productive.

What to Document While Coding

Engineers hate writing documentation. The problem is, writing documentation is a necessity – without it, it becomes hard for teammates to ramp up on features/components of the system and it’s hard for the engineer who wrote the code to remember what he was thinking at the time.

The problem with documentation (other than it’s not “real work!”) is that at best it becomes a time suck to keep up to date, and at worst it is wrong, and having wrong documentation is worse than having no documentation.

How can teams write documentation then that doesn’t get out of date? The best way that I’ve found is to focus on things that are always true vs things that are currently true. Here are a few examples.

Why did we make that decision?

It is always a good idea to document why a decision was made, since the reasons you had at the time will never change. The decision might prove to be a bad one. The reasons might prove wrong, or the assumptions behind the reasons might be flawed. Documenting the thought behind a decision lets you revisit the decision when those reasons no longer hold and lets you learn for the next time you make a similar decision.

What is the purpose of this project/class/method/line?

Say you have an if statement that checks whether a property of an object has a certain value. If later you learn that the property can never have that value, it’s safe to remove that snippet of code.

The exact same logic applies as you scale up, even as the investment scales up. A project whose purpose rarely applies, or that rests on an assumption about the market that turned out to be wrong, needs to be reshaped or wound down.

On the flip side, documenting the purpose of something also lets you defend it. When someone asks why the team is doing something a particular way, documenting the reasons makes it easy to point out the reason for continuing something.
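
Going back to the if statement example above, here is a small sketch of what documenting purpose at the line level might look like. The `Order` class, the "legacy" status, and the importer mentioned in the comment are hypothetical, invented only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Order:
    status: str
    total: float


def billable_total(order):
    # Purpose: skip orders imported from the old system, which arrive with
    # status "legacy" and were already billed there.
    # Assumption: the importer still produces "legacy" orders. If that ever
    # stops being true, this check can be deleted.
    if order.status == "legacy":
        return 0.0
    return order.total


print(billable_total(Order(status="legacy", total=40.0)))
print(billable_total(Order(status="open", total=40.0)))
```

The comment says nothing about how the check works, which the code already shows. It records why the check exists and the condition under which it can be removed, and neither of those goes stale when the surrounding code is refactored.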

What assumptions are we making?

Assumptions are often a point of miscommunication. Being as explicit about these as possible helps prevent miscommunication. Assumptions might change over time, but documenting the assumptions at the time gives a nice snapshot of the team’s mindset.

Conclusion

In general, it is helpful to think of documentation as something that builds on itself, rather than as something to change. Don’t just change documentation to represent the current state. Instead, add a new revision saying what in the external world has changed and therefore what needs to change within the code.

Advice For Aspiring Software Engineers

A few weeks ago, Columbia hosted a networking event for current engineering students to meet alumni. A few questions came up again and again, but the “speed-dating” format meant my answers were rushed. Now that I’ve had a moment to reflect, I thought I’d present my two cents on some of these common questions.

I want to build a web application, what do I do?

First, pick your favorite language. The language doesn’t matter: Java, Scala, C#, Python, Ruby, PHP, whatever it is. Second, learn the standard stack for that language. For Python, learn Django. For Ruby, learn Rails. For Java, the stack has lots of smaller pieces, but learn Maven/Hibernate/Spring/Jetty/JSP. Don’t worry about picking the “wrong” stack, since the architecture of web applications is generally the same, consisting of some database (relational or NoSQL), object mapper, data access, business logic, and presentation layers. To learn how to build a web application, just start building one.

Then, just follow tutorials. If you don’t know how to do something, Google it and start by doing the most brain-dead way you find. As you get more experience and see yourself doing similar things over and over again, you’ll learn what to invest your time in knowing fluently and what stuff you should just look up.

I have an idea for a business, what do I do?

I have to admit that I have never been a founder for a startup. However, having worked at two early stage startups and one later stage startup, I do have some experience.

The best thing to do is just start. If you have an idea, figure out how to get validation as quickly and cheaply as possible, keeping in mind that your time is one of the most expensive things you have. Some students actually already have this figured out (see for example here and here).

Inevitably with people from programming backgrounds, there is a tendency to do the building for fun. Programming is fun, but when time is precious focus on building things that are going to verify your business’s model as soon as possible. We are taught as students that failure is bad, but professionally you should seek out the quickest way to failure. The sooner you fail at one idea, the sooner you can start working on the idea that actually will work.

How do I get a job?

My shameless plug was and still is that Yodle is hiring.

Not everyone at Columbia will end up at Yodle though, so for everyone else: First, have some kind of web presence. If you’re a student, have a student page in the directory. That’s how I found my first job. Second, it is easier to find a job if you have someone soften things up for you, so build and leverage your network – professors, fellow alumni, people you met at summer internships or working on open source projects.

Another big piece of getting a job is knowing what you want to do. If you tell someone in an interview that you are applying to do very different kinds of jobs, you should have a pretty good reason about what appeals to you in each.

How do I keep improving?

The single best way to improve is to write code. Write code at work, write code outside of work. Work on challenging engineering problems. When you write code, practice with good habits: use source control, write unit tests, refactor, make it readable.

The second best thing you can do is to read. Read all you can about software engineering. Read blogs. If you are not reading, then you are missing out on how the rest of the world is improving.

Estimating The Causal Effect Of Online Display Advertising

To all my loyal readers, I apologize for the hiatus, but life happens. About four months ago, I started a new position at Yodle leading the team of quantitative developers that work on our bidding algorithm. The first few months have flown and I have learned a ton, but have been very busy and have left little time for writing.

Or for attending meetups, but I made it to a pretty good one earlier this evening titled “Estimating The Causal Effect Of Online Display Advertising,” presented by Ori Stitelman of Media6Degrees. The main premise of the talk is that A/B testing, now an industry standard, can be costly or impossible, and it’d be nice if we could figure out causality based on something we already have lying around – our data. Of course, we are interested in display advertising at Yodle, but the techniques Mr. Stitelman outlined have other applications: How do you measure what makes customers happy? How do you measure the effect organic and paid search have on each other?

He first started off by describing a methodology for doing this kind of quantitative analysis:

  1. State the Question: What is the business problem you are trying to solve?
  2. Define Causal Assumptions: Without getting into a statistical model, what is the causal relationship between events?
  3. Define Parameter: what is the parameter you actually care about?
  4. Estimate the Parameter.

Mr. Stitelman then described a few different ways of estimating parameters:

  1. Inverse Probability-Weighted Estimates: a technique that weights each observed event by how unlikely it was to happen for that particular example.
  2. Maximum Likelihood Estimator
  3. Targeted Maximum Likelihood Estimator
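
To give a feel for the first technique, here is a minimal sketch of the textbook inverse probability-weighted estimate of an average treatment effect. This is the standard form of the estimator, not necessarily the exact version presented in the talk. The function name `ipw_effect`, the record layout, and the data are made up, and the sketch assumes the propensities (the probability that each user would be shown the ad) have already been estimated elsewhere.

```python
def ipw_effect(records):
    """Estimate the average effect of treatment from observational data.

    records: list of (treated, outcome, propensity) tuples where
      treated    is 1 if the user saw the ad, else 0
      outcome    is 1 if the user converted, else 0
      propensity is the estimated probability that this user would be
                 treated; it must be strictly between 0 and 1
    """
    n = len(records)
    treated_sum = 0.0
    control_sum = 0.0
    for treated, outcome, propensity in records:
        if treated:
            # A treated user who was unlikely to be treated counts for more.
            treated_sum += outcome / propensity
        else:
            # An untreated user who was likely to be treated counts for more.
            control_sum += outcome / (1.0 - propensity)
    return treated_sum / n - control_sum / n


# Made-up data: (saw_ad, converted, probability_of_seeing_ad)
observations = [
    (1, 1, 0.8),
    (1, 0, 0.8),
    (0, 0, 0.8),
    (1, 1, 0.2),
    (0, 0, 0.2),
    (0, 1, 0.2),
    (0, 0, 0.2),
    (0, 0, 0.2),
]

print(ipw_effect(observations))
```

The weighting compensates for the fact that, without a randomized A/B test, the users who saw the ad are not a random sample: each observation is scaled up by how under-represented its kind of user is in the group it ended up in.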

It’s obviously difficult to digest all this kind of material in an hour-long talk, but the results presented show that this technique has promise.

What I have learned so far about meetup talks is that the ones worth attending add a whole new load of reading to your stack, and my stack is looking pretty high.

The Value of Automated Tests

Test Driven Development is a very helpful way to program. My experience has been, however, that you cannot always write your tests first; I sometimes need to get down into the weeds to understand the problem I’m trying to solve. Michael Hartl, author of a well known online Ruby tutorial, also writes about this more balanced approach.

It’s important to understand that TDD is not always the right tool for the job. In particular, when you aren’t at all sure how to solve a given programming problem, it’s often useful to skip the tests and write only application code, just to get a sense of what the solution will look like… Once you see the general shape of the solution, you can then use TDD to implement a more polished version. (Ruby on Rails Tutorial)

This less dogmatic approach to TDD opens the question: when is it appropriate to use TDD, and when do you write tests afterwards? Asked another way, for a particular kind of problem, when do you start writing tests? There are only a few different choices, it seems, for when to write a test: before you write your application code, after you write your application code but before you push it to others, or after you’ve pushed it to others and they have found specific problems with it. To know when to write tests, all you have to know is what benefit you get from the tests, what the cost is and whether you are in a position to take advantage of the benefit:

  1. Developing a test for a particular bug fix helps prevent that bug from ever happening again.
  2. Developing a test after writing the application code helps when refactoring and prevents those classes of bugs from happening again.
  3. Developing a test before writing the application code helps inform the design of the code, helps refactoring and helps bug prevention.

The earlier you write tests, the more work it is, but the more potential value there is in writing them, because you get the value you would get if you wrote them later plus some additional value. You have to be in a position, however, to take advantage of that value – if you are not, then you create more work for yourself.

The first choice above is to do it as late as possible – when you fix a bug. The value that the test brings to your business is to ensure this bug will not happen again, so as long as it is a bug you care about it is worth adding an automated test. This is the latest it is responsible to add an automated test – the only way you can write tests any later is to not write automated tests at all. It has been my experience that if you do not write a test for a bug, it will break again.
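
As an illustration of this first choice, here is a sketch of a regression test written alongside a bug fix, using Python’s standard `unittest` module. The function `average_bid` and the bug it once had (crashing on an empty list) are hypothetical.

```python
import unittest


def average_bid(bids):
    """Return the mean bid, or 0.0 when there are no bids."""
    if not bids:  # the fix: without this check an empty list raised ZeroDivisionError
        return 0.0
    return sum(bids) / len(bids)


class AverageBidRegressionTest(unittest.TestCase):
    def test_empty_bid_list_returns_zero(self):
        # Reproduces the original bug report: an empty list crashed the caller.
        self.assertEqual(average_bid([]), 0.0)

    def test_non_empty_list_is_unchanged(self):
        # Guards against the fix altering the normal case.
        self.assertEqual(average_bid([1.0, 3.0]), 2.0)


# Run the tests explicitly so the sketch also works when pasted into a script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageBidRegressionTest)
result = unittest.TextTestRunner().run(suite)
```

The first test encodes the bug report itself, so if someone later removes the empty-list check, the suite fails before the bug reaches anyone else.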

The second choice does it a bit earlier. By having an automated test suite after the code is written, you not only prevent future bugs, but you can confidently refactor your code. It makes sense to start adding tests at this point in the life-cycle of your code only if this is code that will be maintained.

The third choice does it as early as possible: before your application code. You get the most potential value from writing your tests at this point, but it requires that you have a sense of how to test the application, what the client code should look like, and how the code will be used. Sometimes we know these things, and sometimes we don’t. Sometimes your boss will come to you and ask you if such-and-such is possible, or sometimes you are trying to scratch an intellectual itch. In these cases, you probably will not be able to write the tests.

When to do (and not to do) test driven development is a hard thing to explain to others. This way of looking at it is easy to remember, and also helps explain to others why test driven development is valuable in the first place.