Philip Kucheryavy, Software Engineer in the Operations Team
- He is 24 and has a beard
- In love with Linux and Python
- Has no higher education diploma
Dmitry Trofimov, Head of Front-End Design for the World of Tanks UI
- Interested in coding since he was 14
- At 17, he wrote a simple GUI for MS-DOS
- In addition to World of Tanks, his favorite games include Heroes of Might and Magic III
- Big fan of Java
- Has a number of diplomas and certificates in programming
Dmitry Ovchinnikov, Web Developer
- He came to web development through Perl
- Holds three diplomas
- Hates wires and recognizes only Bluetooth/Wi-Fi
- After seven years with Gentoo, switched to OS X and has no regrets
- Hides that he knows PHP
- Over 3,400 employees work in Wargaming offices all over the world
- More than 15 games released by WG since the company was founded in 1998
- 100,000,000+ users registered in projects of Wargaming
- World of Tanks Blitz reached the Top 3 in the App Store in 50 countries around the world and has been downloaded more than 5,000,000 times
Our server is driven by the BigWorld engine, which is written in C++ and Python. Everything that is performance-critical is written in C++; the rest is done in Python.
We have a lot of hardware, a huge zoo. There's no getting by without automation here. To automate things, we use Ansible, Fabric and Puppet. And we monitor with Zabbix.
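To give a flavor of what such automation looks like, here is a minimal Ansible playbook that installs and configures a Zabbix agent on a group of hosts. This is a sketch only: the group name, template file and paths are assumptions for illustration, not Wargaming's actual configuration.

```yaml
# Hypothetical playbook: group name, template and paths are invented.
- hosts: game_servers
  become: true
  tasks:
    - name: Install the Zabbix agent
      apt:
        name: zabbix-agent
        state: present

    - name: Deploy the agent configuration
      template:
        src: zabbix_agentd.conf.j2
        dest: /etc/zabbix/zabbix_agentd.conf
      notify: restart zabbix-agent

  handlers:
    - name: restart zabbix-agent
      service:
        name: zabbix-agent
        state: restarted
```

The handler pattern shown here restarts the agent only when the configuration actually changes, which matters when a play runs across a large fleet.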
We are looking into fashionable virtualization solutions like Docker but still don't want to let them into production. Currently, we are trying to move local instances onto that kind of virtualization, as it yields major savings. Today those instances are also hosted on virtual machines, but on more “serious”, less specialized solutions such as VMware and others.
Some parts are written in Erlang, such as transactions. But these are only external additions to the engine: standalone subsystems acting as individual crutches, such as message delivery and communication. The engine itself is not involved in them.
We have a very particular development process and a lot of tools heavily customized to our needs. We use open source, try to contribute back, and hold all sorts of meetups (I personally love to participate in them). At the same time, we have some very specific things that aren't worth publishing. Not because it's a big secret, but simply because no one would need them, except us.
We have our own hardware. We personally select the necessary hardware, buy it and then operate it. Of course, the cloud is great, but it is expensive and not always convenient.
If there is a sudden “explosion” in load, we always have backup capacity on cold standby. It is a fallback option that lets us quickly bring up another cluster in case of a load spike.
When you fully own the hardware, you can realistically assess the situation. For example, we have experimented with Hyper-Threading for a long time and are still weighing its pros and cons. In some situations it only interfered with the work, and we deliberately disabled it. Having access to the hardware, we can update the BIOS firmware ourselves and see what is going on in there.
In general, if the need arises, we can rearrange everything and bring a stack up within a day or less. Naturally, we try to balance the hardware across projects, and when we need to shift hardware somewhere else, it is not a problem, as it can be done quite quickly.
We handle scaling mostly with RabbitMQ. In addition, BigWorld has its own clustering technology for connections between machines and between clusters. The Russian server today is nine peripheries and a center, and BigWorld has its own technology and its own protocol for connecting a periphery to the center. The “Rabbit” is used separately, as its own stack, for the connection between BigWorld and the web.
Our QA have access to a magic button called “make me feel good”. At any time, when they need a certain version of an environment or project, they press this button and, strange as it may seem, they start to feel good. I believe that everyone should seek to automate their work as much as possible.
Mostly, we work in monthly development cycles. At the end of each cycle, we roll out to production, and more often than not this happens on site. Production rollouts are done under manual control; the process is partly automated and partly not.
We deploy directly from source, with no packages. We work with large amounts of code, and nobody wants to build multi-gigabyte packages every time, let alone when we just need to ship a patch. In addition, delivery goes out to many servers simultaneously, which immediately saturates the network and resources that are, at the same time, under load from the players. In other words, everything must be done very carefully.
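Pushing source to every server at once is exactly what saturates the channel, so delivery like this is typically staggered in small batches. A minimal sketch of that batching logic, assuming hypothetical host names and a stubbed-out deploy step:

```python
from itertools import islice

def batched(hosts, size):
    """Yield successive batches of hosts so we never push to all at once."""
    it = iter(hosts)
    while batch := list(islice(it, size)):
        yield batch

def deploy(host):
    # Stub: a real deploy would sync the source tree to the host
    # (e.g. via rsync) instead of printing.
    print(f"syncing source to {host}")

hosts = [f"web{i:02d}" for i in range(1, 7)]   # hypothetical host names
for batch in batched(hosts, 2):                # push to 2 hosts at a time
    for host in batch:
        deploy(host)
```

Keeping the batch size small trades total rollout time for headroom on the network and on machines that are still serving players.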
What is BigWorld on the client side? Ironically, it is the same Python as on the server. In our company, BigWorld is coupled with technology that makes it possible to integrate Flash into the game client. There is a virtual machine, a proprietary implementation from Autodesk, and a GFx player that runs Flash inside the client.
Yes, we use Flash as the UI for Tanks.
It is believed that Flash is dying. Let me say that this is nonsense. The trend is that Flash is becoming a highly specialized technology. Yes, all sorts of Flash games on the Web will most likely die. But in major projects such as “Tanks”, “Ships” and so on, Flash has proven to be a very capable technology. It is perfectly suited to the performance challenges and to some other areas… like adding buttons and that sort of thing.
Why Flash? First, when we took up Flash, there were no alternatives like HTML5. Second, Flash has native support for HTML, and we do some things using HTML markup. Actually, I would dispute that HTML5 is faster or delivers higher performance: at least for our client, analysis showed that HTML5 would perform worse than Flash.
Switching to HTML5 would mean retraining a vast team of Flash developers for a new technology, which cannot be called an optimal solution. And, of course, don’t forget that, in this case, everything would have to be rewritten. This is not something that we want now.
Personally, I would not count on HTML5 at this stage of its development, given that it is not standing very firmly in the market. It has competitors that are advancing quite actively: Mozilla and Unity, for example, are now collaborating closely on their technology. As a result, HTML5 might be pushed aside altogether.
By the way, we did try to avoid Flash by building the UI with the available alternatives, such as Python components. As it turned out, these were just terrible crutches: we constantly lacked something, and what was available couldn't deliver what we needed. We started thinking about what to do and began writing various components ourselves. As soon as it came to complex animations, effects and the like, it became clear that we needed a serious editor to define visual elements, animate them and make them “come alive”. Flash was the only truly powerful technology able to handle all of that, and we have been using it ever since.
Even though we are looking for an alternative to Flash, so far it perfectly handles the current tasks. For now, we are not planning any internal implementation.
On the Flash side, we use all sorts of open-source things, like GreenSock, as well as ports of standard Java libraries for working with data structures.
The only problem is that finding Flash developers has become genuinely hard. Our demands in terms of Flash have grown considerably, while almost all of the most experienced people are either already on our team or engaged by similar companies. It has become really hard to find a good, experienced professional on the market.
Of course, Flash is not the only client technology. For different platforms, we use different technologies: for example, we build a separate client for Xbox, and the mobile version has a separate engine in whose development our studio is actively involved. We have many different technologies.
On the Web, we use a Python, Django, memcached and MySQL stack; it all works well.
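On a stack like this, the usual way memcached and MySQL fit together is the cache-aside pattern: check the cache first, fall back to the database, then populate the cache. A minimal sketch of that pattern, with plain dicts standing in for memcached and for a MySQL table (all names here are hypothetical):

```python
cache = {}                                # stands in for memcached
database = {1: "Philip", 2: "Dmitry"}     # stands in for a MySQL table

def get_user(user_id):
    """Cache-aside read: try the cache, fall back to the database."""
    if user_id in cache:
        return cache[user_id]             # cache hit
    value = database.get(user_id)         # in reality: a SELECT, e.g. via the Django ORM
    if value is not None:
        cache[user_id] = value            # populate the cache for next time
    return value

get_user(1)   # first call: miss, reads the "database" and fills the cache
get_user(1)   # second call: served straight from the "cache"
```

In a real deployment, invalidating or expiring cached entries when the underlying row changes is the part that needs the most care.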
Amazing as it may seem, we don't store user data but pass it on: we just authenticate users and hand them over. It is hard for me to estimate what goes on behind this “large stone wall”, but as far as I remember, there's MySQL. Or, more precisely, Percona.
We are still experimenting with different MySQL forks. Basically, everything depends on the hardware. All data related to BigWorld and the web rely on MySQL. The web has its own databases and BigWorld has its own, but they can communicate with each other through RabbitMQ.
We are using Percona and MySQL, and we also evaluate them on different operating systems. I know for sure that we are not doing any extreme optimization; our solutions are mostly out of the box.
RabbitMQ can be easily clustered, and it fits our tasks; historically, we have been using it actively. For deployment, we are developing various proprietary tools (for logging and other goodies).
We pay attention to modern, trendy stacks such as Kibana and Logstash. As a transport for them we considered ZeroMQ, for example, but the cluster needs a centralized solution, and ZeroMQ doesn't fit, as it is decentralized. RabbitMQ integrates better into our scheme.
We have our own Hadoop cluster. The data guys are working with it for statistics and related things. We collect these data not for the players but for ourselves. We aggregate various statistics, we analyze them, and then our analysts make calculations.
We also use NoSQL databases, but not for production. In our deployment, we use MongoDB to store statistics and related information: we collect statistics from the servers and process them further with our proprietary tools. Roughly speaking, it's just a JSON data store. This is not production; it's just self-service tooling.
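Used this way, MongoDB is essentially a store of JSON-like documents. A sketch of what one per-server stats document might look like — every field name and value here is invented for illustration, and a real deployment would insert it through a MongoDB driver rather than serialize it by hand:

```python
import json

# Hypothetical stats document collected from one server.
stats_doc = {
    "server": "periphery-03",
    "timestamp": "2015-06-01T12:00:00Z",
    "metrics": {"online_players": 41250, "cpu_load": 0.73},
}

# In production this would be something like collection.insert_one(stats_doc);
# here we just round-trip it to show it is plain JSON-serializable data.
serialized = json.dumps(stats_doc)
restored = json.loads(serialized)
```

The appeal of this setup is exactly that the schema is free-form: each tool can add its own fields without migrations.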
Challenges and fails
There has not been a single project where we managed to completely avoid fails. Personally, my first fail occurred on a test server. This server was also involved in production, but it was intended not for the players but for ourselves, to run tests. It was my very first deployment at the company; at the time, I had been with the company for only three months. I rolled out the code, we tested it, and it turned out that I had shipped completely wrong code. I had to rework it and ask QA to check everything again. I was ashamed. That situation taught me to double-check everything and to be more attentive.