After looking at the technology behind Facebook, the world's largest PHP site, we now turn to a PHP site at the million-user scale whose architecture is worth studying: Poppen.de. Poppen.de is a German social networking site. Compared with Facebook or Flickr it is very small, but its architecture integrates many technologies well, including Nginx, MySQL, CouchDB, Erlang, Memcached, RabbitMQ, PHP, Graphite, Red5 and Tsung.
Poppen.de currently has 2 million registered users, and the project team consists of 11 developers, two designers and two system administrators. The site uses a freemium business model: free users can search for other users, send messages to friends, and upload pictures and videos.
Users who want unlimited messaging and picture uploads have to pay for one of several membership tiers. Video chat and the site's other services follow the same strategy.
All Poppen.de services sit behind Nginx. Two front-end Nginx servers handle 150,000 requests per minute at peak (about 2,500 per second); each machine is four years old and has only one CPU and 3GB of RAM. Poppen.de also has three separate image servers, whose three Nginx instances serve 80,000 requests per minute for *.bilder.poppen.de.
The Nginx setup has a neat design: many requests are answered from Memcached, so content comes from the cache without the request ever reaching the PHP machines. For example, the user profile page is one of the most expensive pages for the site to generate. If the whole page is cached in Memcached, requests for it are served straight from Memcached. Poppen.de's Memcached handles 8,000 requests per minute.
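The cache-first read path described above can be sketched in a few lines. This is only an illustration under stated assumptions: a plain dictionary stands in for Memcached, a callable stands in for the PHP machines, and the class and method names are hypothetical. In this sketch the front end fills the cache itself on a miss, which is a simplification and not a description of how the site populates Memcached.

```python
from typing import Callable, Dict, Optional


class CacheFirstFrontend:
    """Serve a page from the cache when possible, else ask the backend."""

    def __init__(self, render_backend: Callable[[str], str]) -> None:
        self.cache: Dict[str, str] = {}        # stands in for Memcached
        self.render_backend = render_backend   # stands in for the PHP machines
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        cached: Optional[str] = self.cache.get(key)
        if cached is not None:
            # Cache hit: the backend is never contacted.
            self.hits += 1
            return cached
        # Cache miss: render once on the backend and remember the result.
        self.misses += 1
        body = self.render_backend(key)
        self.cache[key] = body
        return body
```

A second request for the same profile page is then answered without invoking the backend, which is the effect the article attributes to putting Memcached in front of PHP.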
The three Nginx image servers each keep a local image cache, while images uploaded by users are stored on a central file server. When one of the three Nginx servers receives a request for an image that is not in its local cache, it downloads the image from the central file server, caches it locally and serves it. This load-balanced, distributed image-serving setup reduces the load on the main storage device.
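The pull-through behaviour of an image server can be sketched as follows. This is a minimal illustration, assuming both the local cache and the central file server are reachable as directories. The function name and directory layout are hypothetical, and the real site does this inside Nginx, not in Python.

```python
import shutil
from pathlib import Path


def serve_image(name: str, local_cache: Path, central_store: Path) -> bytes:
    """Return image bytes, copying from the central store on a local miss."""
    local_path = local_cache / name
    if not local_path.exists():
        # Local miss: fetch once from the central file server and keep a copy.
        # Raises FileNotFoundError if the central store lacks the image too.
        shutil.copyfile(central_store / name, local_path)
    # From here on, requests for this image no longer touch central storage.
    return local_path.read_bytes()
```

Only the first request for an image reaches central storage. Every later request is served from the local copy.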
The site runs on PHP-FPM. There are 28 PHP machines, each with two CPUs and 6GB of memory, and each runs 100 PHP-FPM worker processes. They use PHP 5.3.x with APC enabled. Moving to PHP 5.3 reduced CPU and memory usage by 30%.
The code is built on the Symfony 1.2 framework. This has several advantages: the team can draw on external resources, project development is faster, and a well-known framework makes it easier for new developers to join the team. Nothing is perfect, but Symfony brings many benefits, so the team can concentrate on Poppen.de's business logic.
Website performance is optimized with XHProf, a profiling library open-sourced by Facebook. Because the framework is easy to customize and configure, most of the expensive server-side computations can be cached.
MySQL is the site's main RDBMS, and the site runs several MySQL servers. One server with four CPUs and 32GB of RAM stores user information such as basic profile data, photos and descriptions. This machine has been in service for four years, and the next step is to replace it with a sharded cluster. The design of that system is still being worked out, with the goal of keeping the data access code simple. Data will be partitioned by user ID, because most of the site's information is user-centric: photos, videos, messages and so on.
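Partitioning by user ID could look like the sketch below. The article says the sharded design was still being worked out, so the modulo scheme, the shard names and the function name are all assumptions made for illustration.

```python
from typing import Sequence

# Hypothetical shard names; the article does not describe the real layout.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]


def shard_for_user(user_id: int, shards: Sequence[str] = SHARDS) -> str:
    """Map a user ID to the shard that would hold all of that user's data."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    # All rows keyed by the same user ID land on the same shard, so a
    # user-centric query (photos, videos, messages) touches one server.
    return shards[user_id % len(shards)]
```

A plain modulo mapping is the simplest option, but it forces data to move when the number of shards changes. A lookup table or consistent hashing avoids that at the cost of more machinery.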
Three servers in a master-slave-slave configuration serve the user forum. One slave server is responsible for the site's custom message store, which now holds 250 million messages. Another four machines are set up in master-slave configurations. A further group of four machines forms an NDB cluster dedicated to write-intensive data, such as user visit statistics.
Tables are designed to avoid joins, since most data is cached. As a consequence, the schema is heavily denormalized. To make searching easier, the database design includes dedicated data-mining tables. Most tables use MyISAM, which provides fast lookups. The current problem is that more and more operations end up locking entire tables, so Poppen.de is considering migrating to the XtraDB storage engine.
The application makes heavy use of Memcached: more than 45GB of cache spread over 51 nodes. It caches sessions, views and the results of function executions. The architecture includes a system that automatically updates the cache whenever a record is modified. Possible future improvements to cache updating include using the new Redis hash API or MongoDB.
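The idea of refreshing the cache whenever a record changes can be sketched as a write-through store. This is an illustration only: dictionaries stand in for MySQL and Memcached, and the class and method names are hypothetical, not the site's actual code.

```python
from typing import Dict, Optional


class CachedUserStore:
    """Keep the cache in step with the database on every write."""

    def __init__(self) -> None:
        self.db: Dict[int, dict] = {}      # stands in for a MySQL table
        self.cache: Dict[int, dict] = {}   # stands in for Memcached

    def save(self, user_id: int, record: dict) -> None:
        # Write the record, then refresh the cached copy in the same step,
        # so readers never see a stale entry after a modification.
        self.db[user_id] = dict(record)
        self.cache[user_id] = dict(record)

    def load(self, user_id: int) -> Optional[dict]:
        if user_id in self.cache:
            return dict(self.cache[user_id])
        record = self.db.get(user_id)
        if record is not None:
            self.cache[user_id] = dict(record)
            return dict(record)
        return None
```

Updating the cache at write time means reads can trust cached entries without an expiry check. The trade-off is that every write path has to go through the store.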
RabbitMQ has been part of the architecture since 2009. It is a very good messaging solution that was easy to deploy and integrate, and two RabbitMQ servers now run behind LVS. Over the last month more and more work has been moved into the queue, so the 28 PHP servers now send 50 million requests a day to it. Logging, e-mail notifications, system messages, image uploads and more all go through the queue.
Messages are enqueued using PHP-FPM's fastcgi_finish_request() function, which allows them to be sent to the queue asynchronously. Once the system has produced the HTML or JSON response, it calls this function so that the user does not have to wait for the PHP script to finish its clean-up work.
This setup also improves resource management. For example, at peak the site handles 1,000 login requests per minute, which means up to 1,000 concurrent updates to the user table to store each user's last login time. With a queue, these queries can instead be run one after another. If more processing speed is needed, it is enough to add more queue consumers, or even more servers to the cluster, without modifying any configuration or deploying new code.
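The effect of pushing login-time updates through a queue can be sketched with the standard library. Here `queue.Queue` stands in for RabbitMQ and a dictionary stands in for the user table. The function name and the event format are hypothetical, chosen only for this illustration.

```python
import queue
import threading
from typing import Dict, Iterable, Tuple


def run_login_updates(events: Iterable[Tuple[int, int]],
                      last_login: Dict[int, int]) -> Dict[int, int]:
    """Apply (user_id, timestamp) login events one at a time via a queue."""
    jobs: "queue.Queue" = queue.Queue()   # stands in for RabbitMQ

    def consumer() -> None:
        while True:
            job = jobs.get()
            if job is None:               # sentinel: no more work
                jobs.task_done()
                break
            user_id, timestamp = job
            # A single consumer applies updates sequentially, so the
            # "table" never sees concurrent writes.
            last_login[user_id] = timestamp
            jobs.task_done()

    worker = threading.Thread(target=consumer)
    worker.start()
    for event in events:                  # producers only enqueue and return
        jobs.put(event)
    jobs.put(None)
    jobs.join()
    worker.join()
    return last_login
```

Producers return as soon as the job is enqueued, and throughput is adjusted by changing the number of consumers, with no change to the producing code.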
Logs are stored in CouchDB, which runs on a single machine. There, logs can be queried and grouped by module and action, by error type and so on, which is very useful for locating problems. Before CouchDB was introduced as a log aggregation service, the team had to log on to each PHP server and analyze its logs to track down a problem, which was very tedious. Now all logs are sent through the queue into CouchDB, so problems can be inspected and analyzed in one place.
Graphite collects real-time information and statistics about the site: requests per module and action, Memcached hits and misses, RabbitMQ status, Unix load and so on. The Graphite server handles 4,800 update operations per minute. In practice it has proven very useful for seeing what is happening on the site, and its simple text protocol and graphing features make it easy to plug into any system that needs monitoring.
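The simple text protocol mentioned above is Graphite's plaintext format, in which each data point is one line of the form `path value timestamp`. The sketch below formats and sends such a line. The metric name, host and function names are made up for illustration, and the port to use depends on how the Carbon daemon is configured (2003 is the usual default for the plaintext listener).

```python
import socket
import time
from typing import Optional


def format_metric(path: str, value: float, timestamp: int) -> str:
    """Build one line of Graphite's plaintext protocol."""
    return f"{path} {value} {timestamp}\n"


def send_metric(host: str, port: int, path: str, value: float,
                timestamp: Optional[int] = None) -> None:
    """Send a single data point to a Carbon plaintext listener over TCP."""
    if timestamp is None:
        timestamp = int(time.time())
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

A call such as `send_metric("graphite.example.internal", 2003, "site.memcached.hits", 42)` would record one point. Because the format is just a line of text, any component that can open a socket can report metrics.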
One neat use of Graphite was monitoring two versions of the site at the same time. In January a new version of the code, built on an updated Symfony framework, was deployed, while the previously deployed version was kept as a backup. The upgrade meant the site might run into performance problems, so Graphite was used to compare the two versions live.
The new version turned out to produce a higher Unix load, so both versions were profiled with XHProf to identify the problem.
The site also offers two kinds of video service: videos uploaded by users, and video chat, where users interact and share video. In mid-2009 the site was serving 17TB of video traffic per month.
Tsung is a distributed benchmarking tool written in Erlang. Poppen.de uses it mainly for HTTP benchmarking and for comparing MySQL with other storage systems such as XtraDB. A recording tool captures the traffic on the main MySQL server and converts it into Tsung benchmark sessions. The traffic is then replayed, with Tsung generating thousands of concurrent users against the lab servers. This makes the test environment very close to real-world conditions.