Yahoo! JAPAN’s DevOps Journey

Yahoo! JAPAN is one of the top search engines and solution providers in the Japanese Internet market. Its DevOps journey has been distinct and passionate. Recently, Microsoft senior technical evangelist Tsuyoshi Ushio met with Yahoo! JAPAN's DevOps team to learn more about their experiences. He shares that dialog here.

The Yahoo! JAPAN DevOps team: From left, Takurou Gotou, Hiroshi Yamaguchi, Minoru Tsuka, Tatsuro Mitsuno, and Teppei Yamaguchi


Tsuyoshi Ushio, senior technical evangelist DevOps / Microsoft


Tsuyoshi: Hi guys. Thank you for attending! Can you tell me the number of developers at Yahoo! JAPAN?
All: We have about 8,200 employees, including 1,800 developers.

Takurou Gotou


Tsuyoshi: Huge! Can you tell me about the current status of DevOps at Yahoo! JAPAN?
Tatsuro: We want to do "Continuous Integration" first. We want to improve both effectiveness and quality. We have a bunch of teams. The problem is the difference in technical level among the teams. Some teams are really cutting edge; however, some teams are old-fashioned. We need to help them learn basic DevOps practices. We have a lot of employees at Yahoo! JAPAN. I think we should implement DevOps; however, some people don't want to do it.
Tsuyoshi: Tell me more.
Hiroshi: We used to have a top-down mindset and strong governance. Generally speaking, we were passive, not active. People would say, "If my boss decides to go for it, I will do it."
Tsuyoshi: Interesting. By the way, you said that teams vary in terms of technical DevOps practices. What kind of differences exist between them?
All: Well, like this. (They discussed the differences using an illustration that groups teams into "sky," "surface," and "deep sea" levels.)
Tsuyoshi: Why does the surface level relate to the Jenkins CI?
Tatsuro: We have a very good tool (Karakuri CI). When we push a tag on GitHub, it automatically tests, provisions, and deploys the code. If the test coverage is below the required threshold (e.g., 80%), it won't deploy.
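
The gate Tatsuro describes can be sketched roughly as follows. This is a minimal illustration, not Karakuri CI's actual code: the names `BuildResult`, `should_deploy`, and `run_pipeline` are hypothetical, and the sketch assumes that a failed test run also blocks deployment, which the interview does not state explicitly.

```python
from dataclasses import dataclass

# The interview gives 80% only as an example threshold.
COVERAGE_THRESHOLD = 80.0


@dataclass
class BuildResult:
    """Hypothetical summary of a test run triggered by a Git tag."""
    tag: str
    tests_passed: bool
    coverage_percent: float


def should_deploy(result: BuildResult, threshold: float = COVERAGE_THRESHOLD) -> bool:
    """Allow deployment only when tests pass and coverage meets the threshold."""
    return result.tests_passed and result.coverage_percent >= threshold


def run_pipeline(result: BuildResult, threshold: float = COVERAGE_THRESHOLD) -> list:
    """Return the pipeline steps that would run for this tagged build."""
    steps = ["test"]
    if should_deploy(result, threshold):
        # Provisioning and deployment happen only behind the quality gate.
        steps += ["provision", "deploy"]
    return steps


if __name__ == "__main__":
    print(run_pipeline(BuildResult("v1.2.0", True, 85.0)))
    print(run_pipeline(BuildResult("v1.2.1", True, 72.5)))
```

The point of a gate like this is that teams only have to write tests and tag a release; everything after the threshold check is automatic.
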
Tsuyoshi: Cool. Generally speaking, Continuous Deployment is difficult. However, if we use this tool and just write automated tests and set up a CI server, it will automatically deploy into production quite easily. These are the differences between "sky" teams and “deep sea” teams, right?
Minoru: Yes. But before we talk about this topic, I'd like to share our concerns about the tool. Sometimes we are worried because we developed the tool by ourselves.
Tsuyoshi: What do you mean?
Minoru: We see new technologies and tools every day. I'm not sure that we can keep up with the speed of technology. Also, the tool is really convenient. It makes us feel foolish, because we benefit from it without having to think much.
Tsuyoshi: Have you built any other tools?
Hiroshi: We developed Karakuri CI based on Travis CI.

Hiroshi Yamaguchi


Tsuyoshi: Why did you develop it by yourself?
Tatsuro: We have a lot of legacy tools. At first, we customized Jenkins to integrate them. However, we realized that it is much easier to develop a tool from scratch than to customize Jenkins.
Tsuyoshi: What an enterprising company you are. Cool. By the way, could you tell me the differences between a "sky" team and a “deep sea” team?
Minoru: Well, Tatsuro's team is a "sky" team. They are a small team and they belong to a subsidiary of Yahoo! JAPAN. So they can move quickly. They use Docker as an immutable testing environment. Of course, they use Agile development.
Tsuyoshi: Do you use Infrastructure as Code?
Tatsuro: We'll try Chef, because Yahoo! (US) uses it. Some projects use Ansible.
Minoru: Sometimes, Chef can't satisfy our security policy.
Tsuyoshi: What do you mean? Do you mean that Chef requires an agent but Ansible doesn't, or something like that?
Minoru: Well, yes, in some ways. For example, there is the question of how to handle keys when we configure servers using Chef. We have a strict security policy.
Tsuyoshi: Good thinking. Then, what is a 'deep sea' team?
Minoru: Sometimes it is like a huge waterfall. One team, which uses Hadoop, needs to operate very carefully, because they can't stop the system. Another team runs a service which operates 5,000 servers. The service automatically gets its libraries using a package management system. However, we have only a few package management servers, which means 5,000 servers fetch the libraries from only a few servers. It is like a DoS attack. If we try to change this system, we need to be careful.
Tsuyoshi: You mean "Excel Driven Development".
All: Yes. LOL

Author's note: In Japan, a lot of documents are written when developing software using the waterfall process. Also, it's common to use Excel for writing specification and design documents. These documents are called "Excel Graph Paper." For example, see "How to lay out your document using Excel Graph paper technique" (Japanese). Some people are critical of this technique because of its low maintainability. They call it "Excel Driven Development."

Tsuyoshi: Hmm. Huge differences between the two. It is like … some teams are in the Civil War era, but others in the future.
All: Ha-ha. If you go through one partition, you can travel 150 years. It could happen, because a single team can have anywhere from 3 to 3,000 members. We have huge variations.

Teppei Yamaguchi

Tsuyoshi: Do you have standard tools?
Minoru: No.
Tsuyoshi: "Self-organized teams should decide the technology." This is an Agile mindset.
Minoru: For example, as an Infrastructure as Code tool, we want to use Chef as our standard. However, each team can choose its own Infrastructure as Code tool. The "self-organized, small team" strategy gives birth to "unicorns." Tatsuro's team is a good example.
Tatsuro: We belong to a subsidiary of Yahoo! JAPAN. Our team is small and can move quickly. We are at the grassroots level. We try to share our insights across Yahoo! JAPAN.
Tsuyoshi: Great! Could you tell me about your DevOps journey?
Minoru: Yahoo! JAPAN changed its board members around 2012. The new COO said that we needed to improve our software process. He said the current software development process was not very good.
Tsuyoshi: Interesting.
Minoru: However, it didn't work well. The fixer was Kazuyoshi Takahashi.

Author's note: Kazuyoshi Takahashi is a famous Agile coach in Japan. He was formerly with Yahoo! JAPAN.

Tsuyoshi: What happened?
Minoru: Kazuyoshi thought, "We need technology evangelists." Then he said to me, "Hey, why don't you be a DevOps evangelist?" So I became a DevOps evangelist. Now we have 4 DevOps evangelists, 10 Agile and Lean evangelists, and 30 to 40 evangelists in other areas.
Tsuyoshi: Is this a grassroots-level team?
Minoru: Yes. We have no power. We have no authority and we are like guerrillas. However, it is much better than being alone. We need these communities. Once we created communities, we could easily collaborate with others. We could have joint events.
Tsuyoshi: When did you start?
Minoru: Around 2014.
Tsuyoshi: Can you tell me the outcome?
Minoru: We have a chat room called the "DevOps room" that was created in March 2014. We had 62 members then; now we have 512.
Tsuyoshi: Great! Can you tell me the reason for your success?
Minoru: We have meet-ups once every two months, on topics such as Fabric, Ansible, and testing. Next, we are going to have a "ChatOps" meet-up. We have some case studies for it.
Tatsuro: Also, we share our statuses, like "I have finished a spike solution for ..." via the chat room.
Tsuyoshi: Do you have a silo problem? It is very common for a DevOps journey, isn't it?
Minoru: Yes. We had two silos: Operations and Development. We had the "Wall of Confusion." We only saw each other when accidents happened. Now we have former Dev people on an Ops team, so we can break the wall. Before that, we had a lot of trouble with communication. For example: "Hey, when is the release?" "Today." "I didn't know that...."
Tsuyoshi: How did you solve the problem?
Minoru: Well, we transferred 'unicorn' developers to an Operations division.

Minoru Tsuka


Tsuyoshi: What? They say that a great programmer is 10 times more productive than an ordinary one.
Minoru: Yes. They work 10 times more productively. That is why we decided to go for it. It was April 2014.
Tsuyoshi: Why did you do it?
Minoru: My boss and I discussed the effectiveness of software development. We thought these two divisions should collaborate more. So we needed to move some members among the divisions. I belong to a Development division. I was responsible for the idea because I came up with it. So I decided that we would transfer our ace programmer to the Ops division.
Tsuyoshi: Was the Development division OK with it?
Minoru: No problem. We could still ask them for help. They transferred, but they are just 10 meters away, so we can talk to them easily. We can also use a messenger application.
Tsuyoshi: Did it work?
Minoru: Yes. We hadn't automated our software engineering process well because of the "Wall of Confusion." After the ace engineers joined, the Ops guys started to say, "Hey, develop a deployment tool" or "Develop a software distribution tool."
Tatsuro: The other teams got Operations people involved in software development. We made it a rule: you deploy it, you watch the alerts.
Tatsuro: Also, the Operations team started to send pull requests. And they started to help Development teams build an application from source code written in C.
Tsuyoshi: Awesome. You can successfully get Ops guys involved in sprints.
Minoru: One more thing. We created a chat room for each service and invited both Dev and Ops. We share information in the chat rooms. Both strategies worked fine. I mean the "Red pill: Transfer unicorn Devs" strategy and the "Blue pill: Ops into a sprint" strategy. I think we need both top-down and grassroots.
Tsuyoshi: These teams must be top-level teams. How about ordinary teams?
All: Hmm. They want to apply DevOps. However, they think they have no resources to do it. They don't know how to do it. Also, we used our own in-house tools for it, which differ from the technology trends outside our company.
Minoru: The deployment tool that we created can manage packaging, service-in, server-out, and provisioning of our infrastructure. It has 30,000 lines of code. It is so useful that we rarely have to pay attention to what it does, which means it will cost us when we try to move to another tool. Now we want to move to Chef. We do use Chef now, but only through the tool.

Author's note: A package management system of Yahoo! JAPAN (Japanese)

Tsuyoshi: Does everyone use it?
All: Yes. However, we don't write test code for it. It works only serially, not concurrently. Sometimes it takes a whole weekend after you execute it.
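
To illustrate the difference between serial and concurrent execution that the team mentions, here is a minimal sketch. It is not the Yahoo! JAPAN tool: `deploy_to_host` is a hypothetical stand-in that only sleeps, and the sketch assumes that each host's work is independent and I/O-bound, which may not hold for a real deployment.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def deploy_to_host(host: str) -> str:
    """Hypothetical stand-in for the work done on one host."""
    time.sleep(0.05)  # simulate waiting on the network or on a remote command
    return f"{host}: ok"


def deploy_serially(hosts: list) -> list:
    """Handle hosts one after another; total time grows with the host count."""
    return [deploy_to_host(host) for host in hosts]


def deploy_concurrently(hosts: list, max_workers: int = 8) -> list:
    """Handle several hosts at once; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(deploy_to_host, hosts))


if __name__ == "__main__":
    hosts = [f"host-{i}" for i in range(8)]
    for runner in (deploy_serially, deploy_concurrently):
        start = time.perf_counter()
        runner(hosts)
        print(runner.__name__, round(time.perf_counter() - start, 2), "seconds")
```

Both functions return the same results; only the waiting overlaps in the concurrent version. A real tool would also need to limit the load on shared services such as package servers.
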
Tsuyoshi: I attended DevOps Enterprise 2015. A lot of companies talked about introducing automated unit testing and dealing with legacy systems. Do you have any difficulty with these?
Minoru: Everyone understands we need to learn something new. However, some only mention the reasons why they DO NOT do it. They usually say, “We don’t have enough resources. We don’t have enough skills to write automated tests” and so on. We help them move on to the next step. They begin to learn a software development process and start to notice the inefficiency.
Tatsuro: The point is whether they can handle automated testing, which means CI works well. This is the difference between surface/sky and deep sea.

Tatsuro Mitsuno

Tsuyoshi: I agree. Automated testing matters. They say that only 40 percent of Agile projects implement proper automated testing. As an Agile consultant, I always taught Test-Driven Development before starting a project. Before teaching TDD, I first taught OOP (object-oriented programming). I used index cards to teach it quickly. I called it "Object-Game." If you use this technique, participants feel like they understand OOP. They may not actually understand it, but that's OK. Then they think, "Hey, I can handle OOP!" It gives them some courage. Then they can easily try TDD without fear. Then the Agile project works fine. This is my presentation at the Agile Conference 2011.

Author's note: Agile 2011 – Agile Education by Object Game

Minoru: I agree. "To feel like you understand something" is important. Also, we do high-level acceptance testing. We start with manual operation and then automate it.
Tsuyoshi: How did you persuade them to write the test?
Minoru: If they find a communication problem with the specification, we have to try to prevent it from happening again. In this case, we have them write some test code. We make no exceptions to this policy.
Tsuyoshi: Could you tell me the number of people belonging to sky/surface and deep sea?
Minoru: Well, 100 skilled people in "sky," 400 in "surface," and 1,000 in "deep sea."
Tsuyoshi: How do you overcome the legacy system problem?
Minoru: It is very difficult for us to write automated unit tests for a legacy system. We write black-box automated tests instead. Sometimes, it is very difficult to read the code, so we try to refactor it using the black-box automated tests as a safety net.
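
A minimal sketch of this black-box approach follows. Everything in it is invented for illustration: `legacy_total_cents` stands in for hard-to-read legacy code, and the discount rules are assumptions, not anything from Yahoo! JAPAN. The idea is to record what the legacy code returns for a set of inputs, then check that the refactored version returns exactly the same values.

```python
import itertools


def legacy_total_cents(qty, unit_cents, kind):
    """Hypothetical legacy function: works, but the logic is tangled."""
    t = qty * unit_cents
    if kind == "gold":
        if qty >= 10:
            t = t - t * 20 // 100
        else:
            t = t - t * 10 // 100
    else:
        if qty >= 10:
            t = t - t * 5 // 100
    return t


def discount_percent(qty, kind):
    """Refactored rule table extracted from the legacy branches."""
    bulk = qty >= 10
    if kind == "gold":
        return 20 if bulk else 10
    return 5 if bulk else 0


def refactored_total_cents(qty, unit_cents, kind):
    """Refactored version that must behave exactly like the legacy one."""
    total = qty * unit_cents
    return total - total * discount_percent(qty, kind) // 100


def record_behavior(func, cases):
    """Treat func as a black box and record its output for every input case."""
    return {case: func(*case) for case in cases}


# Inputs chosen around the branch boundaries; no knowledge of internals is tested.
CASES = list(itertools.product([1, 9, 10, 250], [99, 1500], ["gold", "regular"]))

if __name__ == "__main__":
    golden = record_behavior(legacy_total_cents, CASES)
    assert record_behavior(refactored_total_cents, CASES) == golden
    print("refactored code matches legacy behavior on", len(CASES), "cases")
```

Because the test only compares inputs and outputs, it can be written before anyone fully understands the legacy code, and it keeps guarding the behavior while the code is cleaned up.
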
Tsuyoshi: Do you have any problems with it?
Minoru: Well, a lot of people think they can't handle it. The key is to get them to think, "We can try the first step." Once they can implement CI, it will be much easier for them to progress further than before.
Tsuyoshi: Do you have a goal for the DevOps journey as a member of the original DevOps team?
Tatsuro: DevOps should become commonplace. Then we won't be needed. This is our goal. Generally speaking, your company doesn't need an evangelist, right? We should just meet up as an internal community. We have a cultural way of thinking that "someone should decide something. Not me." Unfortunately, it remains that way. I want to get rid of this. I hope for an "autonomous" mindset.
Tsuyoshi: Thank you for your great interview! Could you give a message for someone who wants to try DevOps, please?

Work hard to be lazy. If you wake up in the morning and everything has been finished automatically, you will feel great, right?

Tsuyoshi (upper left) and the DevOps team. From lower left: Takurou, Teppei, Tatsuro, Minoru, and Hiroshi

