
What I Learned About Networking

Last night, I attended a session at CTO School, a New York City-based meetup for current or future CTOs (I fall into the latter group). The theme of the presentations was networking, which was helpful to me, because it's not something I'm good at. Here are some of the things I learned.

 

Not about acquiring acquaintances

Networking is not about shaking hands with every single person in the room and connecting with them on LinkedIn. Instead, it’s about forging relationships that are valuable to both people.

Provide Value

Many of the speakers discussed both why you need to provide value to others without expecting anything in return and how to do that. Most of the ways are pretty simple. Here are a few examples:

  1. Comment on someone’s blog
  2. Give feedback on their product
  3. Make introductions between people
  4. Send the person content he might be interested in
  5. Seek opportunities to help by following their blog or Twitter
  6. Present talks
  7. Host/sponsor meetups

Develop a Plan

I've known that I need to network, but I have often been frustrated because my networking never seemed to have a direction. By establishing goals, figuring out who can help achieve those goals, and then figuring out how to meet those people, networking takes on a direction.

Use Tools

If you do plan, then there is a lot to keep track of: people to reach out to, when you last reached out to them, how you know them, and what things interest them. The speakers mentioned some valuable tools for managing this process:

  1. Streak Browser Plugin
  2. Followup.cc
  3. CardMunch (only for iPhone, but apparently there are similar tools for Android)
  4. Contactually
  5. and of course, spreadsheets

Importance of Soft Skills as a Leader

Given my career goals, one of the best lines of the night was from Sameer Sirdeshpande. He said that the more successful you become in your career as a technologist, the less your tech skills alone count. That doesn't mean that they aren't important, but it does mean that you are expected to do a lot of other things as well.

It’s about Learning

When networking, you shouldn’t be asking for a job right away. You should listen to people. You should find out about what their interests are. You should be seeking common ground. It is with these things that we forge relationships with each other, and it is only after we have a relationship that we can help each other.

Agile Writing – What Our Teachers Didn’t Tell Us

I recently came upon a post, How to Write Like Your Teacher Told You Not To, that discusses many of the style differences between how we are taught to write in school and how to write blog posts. There are some good points in the post, but there's something else the post doesn't address: the process of writing.

The academic writing process I learned in school focused on planning as a part of writing – identifying the subject matter, forming a thesis, and doing an outline before starting a first draft. In the real world though, we generally don't write that way. I could use my writing experience as an example, but a more authoritative example is that of David McCullough (from an interview with Charlie Rose, 48:45 into the interview):

As soon as you start writing, you become aware of what you don't know, what you need to know… When you write, you suddenly have ideas, or insights, or questions, that you wouldn't have if you weren't writing.

The point is you can’t have a plan until you’ve done a rough draft. Without the rough draft, you can’t know if the subject material is too big; how much background reading you’ll need; whether your thesis is tenable; or whether you even care about what you’re writing about in the first place.

This is another example of how an Agile approach to work better handles uncertain creative processes. Plans made at the beginning do not benefit from knowledge accumulated later in the process, and so a robust creative process requires some phase of hands-on, in-the-trenches discovery before a meaningful plan can be produced.

What You Should Use Code Metrics For

There are several tools for measuring different aspects of code quality – PMD, Emma, Clover, Checkstyle. There are other tools that aggregate one or more metrics into combined views or composite metrics, including Maven Sites, Crap4j, and Sonar. There are lots of good tutorials out there for how to set up these tools, so I won't repeat any of those. What I haven't found much on is what purpose a team is supposed to use these tools for. There's a lot that goes into it, but here's my attempt at a short summary:

Measure Improvement, Don’t compare to some other project

It's tempting to compare your project to another project and conclude you're doing a good or bad job based on having a better or worse score for a particular metric. In reality, comparing two projects is dangerous because different programming languages or requirements inevitably lead to different values for these metrics. For example, a piece of software dealing with billing or commissions is likely to be much more complicated than code for user profiles.

Instead, what you should compare your metrics against is your own code metrics over time. Code metrics can reveal how impactful a refactoring was (by a reduction in cyclomatic complexity) or how well the latest test driven development training stuck with the team (by measuring the number of tests, average lines per test, or code coverage percentage). Looking at graphs of metrics over time, especially as part of an Agile Retrospective, can be a great starting point for a discussion about whether the team is improving.
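
To make that concrete, here is a minimal sketch (in Python, with made-up shipping rules) of the kind of refactoring that would show up on such a graph as a drop in cyclomatic complexity:

# Hypothetical example: a pricing rule written as a chain of branches.
# Each branch adds to the cyclomatic complexity that tools like PMD or Sonar report.
def shipping_cost_v1(region, weight):
    if region == "domestic":
        if weight <= 1:
            return 5.0
        return 5.0 + 1.5 * (weight - 1)
    elif region == "canada":
        if weight <= 1:
            return 9.0
        return 9.0 + 2.5 * (weight - 1)
    else:
        if weight <= 1:
            return 15.0
        return 15.0 + 4.0 * (weight - 1)

# The same behavior, refactored so the branching lives in data instead of code.
# The per-function complexity drops, and a metrics graph over time would show
# that drop once the refactoring is merged.
RATES = {"domestic": (5.0, 1.5), "canada": (9.0, 2.5)}
DEFAULT_RATE = (15.0, 4.0)

def shipping_cost_v2(region, weight):
    base, per_unit = RATES.get(region, DEFAULT_RATE)
    return base + per_unit * max(weight - 1, 0)

The point is not that the second version is objectively "good" – it is that the trend between the two versions is something the team can see and discuss.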

Use Code Metrics Before Development, Not Just After

While useful in retrospectives, code metrics are also useful before adding a feature. Engineers normally talk with one another about potential issues that one might run into in a certain piece of code. A good code metrics tool also acts as an extra team member you can consult.

An example of this is when you're working on a part of the code base you're less familiar with. Using code metrics can give you a sense of which classes and methods are going to need refactoring before you touch them. This can lead to much better estimates – knowing beforehand that cleanup work will be needed leads to better project outcomes than finding out halfway through a project.

Code Metrics are metrics about the Team and Project, Not Individuals

There's a temptation when setting up code metrics to use them to measure individual performance. For example, if Alice's code has a higher cyclomatic complexity than Bob's, a very naive conclusion is that Bob is a better coder. However, if Bob has been coding up basic CRUD operations and Alice has been working on complicated billing logic, then the metric clearly says nothing about Alice's or Bob's ability as coders. Instead, the cyclomatic complexity says something about the code as a whole, and an increase says something about the project and potentially about the team, but it says nothing about Alice and Bob. This goes for any other code metric: lines of code, unit test code coverage, PMD warnings, and so on.

Why This Matters

This is probably a subject for another post, but code metrics matter so much because they can bring a whole new level of transparency and objectivity to whether or not a team is improving, and constant and deliberate improvement is the hallmark of a great team. Agile teams in particular benefit from looking at metrics at least once a sprint and using them as a data point in figuring out what the team needs to be more productive.

What to Document While Coding

Engineers hate writing documentation. The problem is, writing documentation is a necessity – without it, it becomes hard for teammates to ramp up on features/components of the system and it’s hard for the engineer who wrote the code to remember what he was thinking at the time.

The problem with documentation (other than that it's not "real work"!) is that at best it becomes a time suck to keep up to date, and at worst it is wrong, and having wrong documentation is worse than having no documentation.

How, then, can teams write documentation that doesn't get out of date? The best way that I've found is to focus on things that are always true rather than things that are only currently true. Here are a few examples.

Why did we make that decision?

It is always a good idea to document why a decision was made, since the reasons it was made at the time will never change. It might prove to be a bad decision. The reasons might prove wrong, or the assumptions behind the reasons might be flawed. Documenting the thought behind a decision lets you change the decision when the reasons for it change, and lets you learn for the next time you make a similar decision.

What is the purpose of this project/class/method/line?

Say you have an if statement that checks whether a property of an object has a certain value. If later you learn that the property can never have that value, it’s safe to remove that snippet of code.
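
As a minimal, hypothetical sketch (the order fields and the migration story are invented for illustration), the documentation that makes that removal safe later can be a single comment next to the check:

def apply_discount(order):
    # Why this guard exists: legacy orders imported from the old billing system
    # can arrive with "total_cents" set to None, and the discount math below
    # would blow up on them. If the legacy import path is ever retired, this
    # check (and this comment) can be deleted.
    if order["total_cents"] is None:
        return order
    order["total_cents"] = int(order["total_cents"] * 0.9)
    return order

print(apply_discount({"total_cents": 1000}))   # {'total_cents': 900}
print(apply_discount({"total_cents": None}))   # legacy order left untouched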

The exact same logic applies as you scale up, even as the investment scales up. A project that tries to accomplish something that turns out not to apply very often, or that assumed something about the market that no longer holds, needs to be morphed or tied off.

On the flip side, documenting the purpose of something also lets you defend it. When someone asks why the team is doing something a particular way, having the reasons documented makes it easy to point out why it is worth continuing.

What assumptions are we making?

Assumptions are often a point of miscommunication. Being as explicit about them as possible helps prevent miscommunication. Assumptions might change over time, but documenting the assumptions at the time gives a nice snapshot of the team's mindset.

Conclusion

In general, it is helpful to think of documentation as something that builds on itself, rather than as something to change. Don’t just change documentation to represent the current state. Instead, add a new revision saying what in the external world has changed and therefore what needs to change within the code.

Advice For Aspiring Software Engineers

A few weeks ago, Columbia hosted a networking event for current engineering students to meet alumni. A few questions came up again and again, and since the event was run in a "speed-dating" format, the answers were rushed. Now that I've had a moment to reflect, I thought I'd present my two cents on some of those common questions.

I want to build a web application, what do I do?

First, pick your favorite language. The language doesn’t matter: Java, Scala, C#, Python, Ruby, PHP, whatever it is. Second, learn the standard stack for that language. For Python, learn Django. For Ruby, learn Rails. For Java, the stack has lots of smaller pieces, but learn Maven/Hibernate/Spring/Jetty/JSP. To learn how to build a web application, just start building it. Don’t worry about picking the “wrong” one, since the architecture of web applications is generally the same, consisting of some database (relational or NoSQL), object mapper, data access, business logic, and presentation layers.
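
To make those layers concrete, here is a minimal, framework-free sketch in Python (the table, data, and function names are invented for illustration) showing the same separation that Django or Rails gives you out of the box:

import sqlite3

# Data access layer: the only code that knows about the database.
def get_user(conn, user_id):
    row = conn.execute(
        "SELECT id, name, signup_year FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return {"id": row[0], "name": row[1], "signup_year": row[2]} if row else None

# Business logic layer: rules about the domain, with no SQL and no HTML.
def membership_label(user, current_year=2013):
    years = current_year - user["signup_year"]
    return "veteran" if years >= 3 else "newcomer"

# Presentation layer: turns the result into something a browser can show.
def render_profile(user, label):
    return "<h1>{}</h1><p>Status: {}</p>".format(user["name"], label)

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT, signup_year INTEGER)")
    conn.execute("INSERT INTO users VALUES (1, 'Ada', 2009)")
    user = get_user(conn, 1)
    print(render_profile(user, membership_label(user)))

A real framework replaces each of these hand-rolled pieces – an ORM for the data access layer, templates for the presentation layer – but the overall shape stays the same, which is why the choice of stack matters less than just getting started.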

Then, just follow tutorials. If you don't know how to do something, Google it and start with the most brain-dead way you can find. As you get more experience and see yourself doing similar things over and over again, you'll learn what to invest your time in knowing fluently and what stuff you should just look up.

I have an idea for a business, what do I do?

I have to admit that I have never been a founder of a startup. However, having worked at two early stage startups and one later stage startup, I do have some experience.

The best thing to do is just start. If you have an idea, figure out how to get validation as quickly and cheaply as possible, keeping in mind that your time is one of the most expensive things you have. Some students actually already have this figured out (see for example here and here).

Inevitably, people from programming backgrounds have a tendency to do the building for fun. Programming is fun, but when time is precious, focus on building the things that are going to verify your business model as soon as possible. We are taught as students that failure is bad, but professionally you should seek out the quickest way to failure. The sooner you fail at one idea, the sooner you can start working on the idea that actually will work.

How do I get a job?

My shameless plug was and still is that Yodle is hiring.

Not everyone at Columbia will end up at Yodle though, so for everyone else: First, have some kind of web presence. If you’re a student, have a student page in the directory. That’s how I found my first job. Second, it is easier to find a job if you have someone soften things up for you, so build and leverage your network – professors, fellow alumni, people you met at summer internships or working on open source projects.

Another big piece of getting a job is knowing what you want to do. If you tell someone in an interview that you are applying to do very different kinds of jobs, you should have a pretty good explanation of what appeals to you in each.

How do I keep improving?

The single best way to improve is to write code. Write code at work, write code outside of work. Work on challenging engineering problems. When you write code, practice with good habits: use source control, write unit tests, refactor, make it readable.

The second best thing you can do is to read. Read all you can about software engineering. Read blogs. If you are not reading, then you are missing out on how the rest of the world is improving.

Estimating The Causal Effect Of Online Display Advertising

To all my loyal readers, I apologize for the hiatus, but life happens. About four months ago, I started a new position at Yodle leading the team of quantitative developers that work on our bidding algorithm. The first few months have flown by and I have learned a ton, but I have been very busy, which has left little time for writing.

Or for attending meetups, but I made it to a pretty good one earlier this evening titled "Estimating The Causal Effect Of Online Display Advertising," presented by Ori Stitelman of Media6Degrees. The main premise of the talk is that A/B testing, now an industry standard, can be costly or impossible, and it'd be nice if we could figure out causality based on something we already have lying around – our data. Of course, we are interested in display advertising at Yodle, but the techniques Mr. Stitelman outlined have other applications: How do you measure what makes customers happy? How do you measure the effect organic and paid search have on each other?

He first started off by describing a methodology for doing this kind of quantitative analysis:

  1. State the Question: What is the business problem you are trying to solve?
  2. Define Causal Assumptions: Without getting into a statistical model, what is the causal relationship between events?
  3. Define Parameter: what is the parameter you actually care about?
  4. Estimate the Parameter.

Mr. Stitelman then described a few different ways of estimating parameters:

  1. Inverse Probability-Weighted Estimates: a technique that weights an event's importance by how unlikely it was to happen for that particular example (see the sketch after this list).
  2. Maximum Likelihood Estimator
  3. Targeted Maximum Likelihood Estimator
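
I won't try to reproduce the math from the talk, but here is a toy Python sketch (entirely made-up numbers, a single confounder, and the simplest possible propensity estimate) of the inverse probability-weighting idea from the first item:

# Each observed outcome is weighted by 1 / P(receiving the exposure it actually
# received), so over-represented cases count less and rare ones count more.
# Rows are (exposed_to_ad, converted, segment); the "segment" is the confounder.
data = [
    (1, 1, "in_market"), (1, 0, "in_market"), (1, 1, "in_market"),
    (0, 0, "in_market"),
    (1, 0, "browsing"),
    (0, 0, "browsing"), (0, 1, "browsing"), (0, 0, "browsing"),
]

def propensity(segment):
    # P(exposed | segment), estimated as a simple frequency within the segment.
    rows = [d for d in data if d[2] == segment]
    return sum(d[0] for d in rows) / len(rows)

def ipw_mean(exposed_value):
    # Weighted average outcome, as if everyone had exposure == exposed_value.
    total, weight_sum = 0.0, 0.0
    for exposed, converted, segment in data:
        if exposed != exposed_value:
            continue
        p = propensity(segment)
        w = 1.0 / (p if exposed_value == 1 else 1.0 - p)
        total += w * converted
        weight_sum += w
    return total / weight_sum

effect = ipw_mean(1) - ipw_mean(0)
print("estimated lift from showing the ad:", round(effect, 3))

The weights re-balance the observed data so that, in aggregate, the exposed and unexposed groups look like the same population – which is what randomizing in an A/B test would have given you directly.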

It's obviously difficult to digest all of this kind of material in an hour-long talk, but the results presented show that this technique has promise.

What I have learned so far about meetup talks is that the ones worth attending add a whole new load of reading to your stack, and my stack is looking pretty high.

The Value of Automated Tests

Test Driven Development is a very helpful way to program. My experience has been, however, that you cannot always write your tests first; I sometimes need to get down into the weeds to understand the problem I'm trying to solve. Michael Hartl, author of a well-known online Ruby tutorial, also writes about this more balanced approach.

It’s important to understand that TDD is not always the right tool for the job. In particular, when you aren’t at all sure how to solve a given programming problem, it’s often useful to skip the tests and write only application code, just to get a sense of what the solution will look like… Once you see the general shape of the solution, you can then use TDD to implement a more polished version. (Ruby on Rails Tutorial)

This less dogmatic approach to TDD raises the question: when is it appropriate to use TDD, and when do you write tests afterwards? Asked another way, for a particular kind of problem, when do you start writing tests? There are only a few choices, it seems, for when to write a test: before you write your application code, after you write your application code but before you push it to others, or after you've pushed it to others and they have found specific problems with it. To know when to write tests, all you have to know is what benefit you get from the tests, what the cost is, and whether you are in a position to take advantage of the benefit:

  1. Developing a test for a particular bug fix helps prevent that bug from ever happening again.
  2. Developing a test after writing the application code helps when refactoring and prevents those classes of bugs from happening again.
  3. Developing a test before writing the application code helps inform the design of the code, helps refactoring and helps bug prevention.

The earlier you write tests, the more work it is, but the more potential value there is in writing them, because you get the value you would have gotten by writing them later plus some additional value. You have to be in a position, however, to take advantage of that value – if you are not, you just create more work for yourself.

The first choice above is to do it as late as possible – when you fix a bug. The value that the test brings to your business is to ensure this bug will not happen again, so as long as it is a bug you care about, it is worth adding an automated test. This is the latest point at which it is responsible to add an automated test – the only way you can write tests any later is to not write automated tests at all. It has been my experience that if you do not write a test for a bug, it will break again.
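
As a small illustration (the function and the bug are hypothetical), the test written at bug-fix time can be as little as:

import unittest

def parse_price(text):
    # Bug fix: inputs like "$1,299.00" used to raise ValueError because of the
    # comma. The tests below pin that behavior down so it cannot silently regress.
    return float(text.strip().lstrip("$").replace(",", ""))

class ParsePriceRegressionTest(unittest.TestCase):
    def test_price_with_thousands_separator(self):
        # Reproduction of the original bug report.
        self.assertEqual(parse_price("$1,299.00"), 1299.0)

    def test_plain_price_still_works(self):
        self.assertEqual(parse_price("15.50"), 15.5)

if __name__ == "__main__":
    unittest.main()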

The second choice does it a bit earlier. By having an automated test suite after the code is written, you not only prevent future bugs, but you can confidently refactor your code. It makes sense to start adding tests at this point in the life-cycle of your code only if this is code that will be maintained.

The third choice does it as early as possible: before your application code. You get the most potential value from writing your tests at this point, but it requires that you have a sense of how to test the application, what the client code should look like, and how the code will be used. Sometimes we know these things, and sometimes we don't. Sometimes your boss will come to you and ask you if such-and-such is possible, or sometimes you are trying to scratch an intellectual itch. In these cases, you probably will not be able to write the tests first.

When to do (and not to do) test driven development is a hard thing to explain to others. This way of looking at it is easy to remember, and also helps explain to others why test driven development is valuable in the first place.

How to Fail Well

We all have that dream – you check your homework schedule, and you've done all your homework for three of your four classes, but you completely forgot about that other class (for me, it's always English class) all semester, and you are going to fail out of school. We are trained to view failures as life-ruining events. It doesn't have to be that way – a little bit of failure is not only helpful, but vital to learning and growing as a person. It requires some work to make failure OK, but there are three steps that help make failure valuable – making failure cheap, making the way you failed informative, and adjusting course based on previous failures.

Make it cheap

When we have "that dream," we wake up worried that we've lost everything we have worked towards. It doesn't have to be that way – there are several things that you can do on software projects to make failure fast and cheap. Some of these techniques are more technically focused, others are more business focused, and when everyone working on a product understands the different ways to fail faster, it can only help the product development team.

At a technical level, engineers have several tools that help make failure cheap. One such tool is source control. Using source control, if I develop a feature on my own computer and I do it completely the wrong way, I can easily make note of what I have learned and back out my changes to start over again from scratch, losing only that time. Another tool is automated testing, whether in the form of unit tests or acceptance tests, which provides a way of checking the code really quickly for problems, letting us know about failures minutes after they happen instead of weeks. Having a way to automate builds and deployments is another crucial step in making failure cheap. If a developer or tester (or better yet, a product owner) can fire up an environment with a certain version very easily, then the team generally is more willing to try things out, see what works and what doesn't, and adjust accordingly. When deployments are expensive, you don't experiment, and you don't learn.

At a business level, certain business processes can make failure cheaper. The simplest one to adopt is developing a minimum viable product (MVP). MVPs provide sanity checks for the riskiest parts of a project, and give the business a way of deciding if a project is worth continuing to fund. Another business practice that makes failure cheaper is continuous deployment. Although continuous deployment requires IT and Operations support to accomplish, it is more appropriately seen as a business process because it fundamentally changes the way that a business looks at its product development. If the business comes up with a new feature that takes three days to implement, then with continuous deployment it can be in production in three days (instead of three months!), and the business can figure out what value the feature brings to a customer much sooner than with a traditional release cycle.

Make it informative

When I have “that dream” about failing in class, I always have the same feeling: I have no idea how I forgot that I even had that class. Failure is made worse by not understanding it, so to make failure acceptable, it needs to be informative.

Games such as Twenty Questions or Guess Who make use of the principle that to identify a solution you should ask questions that give you as much information as possible. The best Yes/No questions split the solution space in two. Take this example: in Guess Who, there are more "men" cards than "women" cards, and some of the men have facial hair. When asking the first question, it is therefore not optimal to ask if your opponent picked a man (or a woman); instead, you should ask if your opponent picked someone with facial hair, since this splits the population in two. The same principle applies in learning about a new software project – early in a project, design decisions (either at a technical or a business level) should be made for the purpose of learning as much as possible.
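
To put a rough number on that intuition, here is a tiny Python sketch (the card counts are made up) comparing the expected number of candidates left after each possible opening question:

# Suppose a hypothetical 24-card deck has 18 men and 6 women, and exactly 12 of
# the cards show facial hair. After a yes/no question, the expected number of
# remaining candidates is: P(yes) * (#matching yes) + P(no) * (#matching no).
def expected_remaining(total, yes_count):
    no_count = total - yes_count
    return (yes_count / total) * yes_count + (no_count / total) * no_count

total = 24
print("Ask 'is it a man?':", expected_remaining(total, 18))   # 15.0 candidates left on average
print("Ask 'facial hair?':", expected_remaining(total, 12))   # 12.0 candidates left on average

The even split leaves fewer candidates on average, which is the same reason an early project milestone should be aimed at the question whose answer you are least sure of.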

When we are learning, either by ourselves or in a team, failure can be very valuable if we get a lot of information out of the failure. Whenever there is the possibility of failure, it helps to think in terms of the scientific method – form a hypothesis about what you think is going to happen, and then build the system to test the hypothesis. The hypothesis therefore has to be falsifiable – there has to be some objective way to know if your hypothesis failed or not.

Make it history

My college track coach used to tell us it’s OK to make a mistake once, but you better not make the same mistake twice. Mistakes are good if you learn from them, but if you fail repeatedly for the same reason, then that’s something different.

There are several ways to learn from past history. In agile development, teams use a retrospective at the end of an iteration to discuss what went well and what can be improved, to establish actionable items that the team can do to improve, and to appoint owners of those items to ensure they get done. At a smaller level, having a coach, mentor or peer give you feedback about how a particular task or stretch of time went plays a similar role.

Keeping some kind of record, such as a professional journal or a team wiki, of all these shortcomings can also help you to remember why you (individually or as a team) do things a certain way.

Conclusion

It is a fact of life – we all fail. Learning from failure is important, but it doesn’t happen magically – the teams that benefit the most from failure are the teams that are prepared to benefit from failure. A team that anticipates certain kinds of failures and makes those failures very cheap is going to beat a team that pays more to make the same kinds of mistakes. A team that fails intentionally in a way to learn more will develop a winning product faster than a team that does not. A team that puts infrastructure in place for learning from history, such as having retrospectives every two to six weeks, will improve faster than a team that continues doing things the same way it always has.

Four Principles of Effective Software Teams

Back in May, I was at a talk presented by Kiril Savino of Gamechanger. Mr. Savino discussed the product they are building, the stack they use to build it, and the process they use in building it. He noted that there are several processes out there, but all effective software processes include the following four principles:

  1. Iteration
  2. Communication
  3. Backlog
  4. Automation

Having had sufficient time now to understand these points, here is my take on them:

Iteration

The principle of iteration is not something that is unique to software. I think everyone was taught in grade school to do a rough draft and work to improve it rather than try to do an immaculate first draft. The first draft of something will always have problems, no matter how well you try to plan beforehand. To make matters worse, until you see the product, you will not know in what ways it is important to improve the work. Every first draft also has some good things in it, and it is hard to see what works well and what does not until you have something tangible in front of you. Being able to work in quick iterations helps to maximize the good while minimizing the bad over the course of a software project.

Communication

Communication comes in two flavors: verbal and written. Written communication compensates for the fact that no one has a perfect memory: by putting down what two team members agreed to, there is a decreased chance of misunderstanding between them. The obvious benefit of written communication is that there is a piece of paper with what two people agreed to. Less obvious is that the act of putting the communication down on paper forces the writer to be more precise and to think out what was said in more detail. I have found that when I write something, whether it is an email, design document or blog post, I find inconsistencies in the way I think about a problem that I don't see until I'm staring at them on a computer screen, and writing them down provides an opportunity to correct my misunderstandings.

Verbal communication's strength is that it is interactive. Whereas a written document can potentially cover the "wrong" set of details, in a conversation one person can tell the other what he does and does not understand. No document, except for the software itself, contains every detail about how a piece of software works, and team members need to talk when they have different views about a requirement or design constraint.

Backlog

Giving the developers on a team a backlog helps them in lots of ways. From a technical standpoint, knowing what features are coming down the pipeline helps to plan what technical infrastructure the team will need and what the team needs to do to train itself. It also helps in terms of morale. Without a backlog describing to the team the direction a product will take before its release, it is never clear whether the team is approaching the goal or not. Another way having a backlog helps is that future requirements come from a well-understood set, and are not being written the day a sprint starts.

Automation

Automation can fundamentally change how a team collaborates. For example, Mr. Savino mentioned in his talk doing automated deployments, which is becoming a common practice. Automated deployments make the feedback loop between developers and those giving feedback (for example, the product owner, QA, customers, or the CEO) much shorter, which makes correcting the software less expensive. It also makes deployments easier, since a developer or system administrator doesn't have to take his time to manually go through an error-prone checklist that might be out of date.

Automated tests are another example of automation that changes team collaboration. If a developer writes a piece of code and someone has problems with it, he has to interrupt the flow of what he is working on, take a look at it, understand the problem, and figure out if it really is a problem with his code or how the code is being used. If, however, that same developer writes a piece of code and the tests to go along with it, then when a bug occurs the team can more easily narrow down the cause of the bug.

Organize 5000 Emails in Two Steps

This morning, I had over 3000 emails in my personal Gmail inbox. Some were read, some weren't. Today, I got up the courage to do the previously unthinkable:

  1. I selected all
  2. I hit the “archive” button.

It was such a good feeling that I did the same thing with the approximately 2000 messages in my work email. If you have several thousand things in your inbox which you will never touch again, I recommend you do the same. It is very freeing.

The inspiration behind this bold move towards being more organized is a book I just finished reading ("Pragmatic Thinking & Learning" – PT&L). One of the gems listed in there was something from David Allen's Getting Things Done:

  • Scan the input queue only once
  • Process each pile of work in order
  • Don’t keep lists in your head

Like many engineers, I was on top of the third one – my desk at work has several sticky-notes attached to various surfaces, we use Jira for bug tracking, and I have to-do items jotted down in notepads. The act of cleaning my inbox put me in a position to do the first two points – it is really hard to classify things when you are looking at that many. When you only have to look at 10 or 15, categorizing them by what you need to do now, what you need to ask someone else about, what you just need to be aware of, and what you can ignore, becomes possible.

Since I was conscious about what I was doing with each email I went through, I had another breakthrough – a lot of the email I get is more of an "FYI" kind of thing. By adopting a practice Hunt mentions – creating a private wiki for use as an "exocortex" – I suddenly had a place to put the various links sent to me by friends, colleagues, and the newsletters I subscribe to. I no longer felt like it was a big deal to "archive" something, since the links I care about are now on a wiki page that only I can access, categorized and labeled, and thus much easier to find than they would be if they were still in my inbox.

I highly recommend this book – it has many “Aha!” moments, it is a “technical” book you can actually discuss with non-technical people, and it makes the reader see learning, thinking and working in a whole different way.