Research group digest

The main idea behind using the TPC-W benchmark was to have a tool that measures statistics such as execution time and throughput. The output file is in Matlab format, which makes it rather hard to read, so I have almost no idea what all the obtained numbers mean :)

The test can be run in three different modes: browsing, ordering and shopping. Each mode defines what share of the queries made to the database are read-only and what share update the state of the database. So far, tests in all three modes have been run on a system with only one database (no replication or distribution). The parameters were as follows:

  • Number of emulated browsers: 20
  • Ramp-up time: 100 seconds
  • Ramp-down time: 50 seconds
  • Request images: false
  • Number of customers: 2880
  • Number of items: 1000
  • Start EBs Incrementally: false (do them all at once)
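
To keep runs comparable later on, the parameters above can be recorded in one place. The following is only a sketch: the class and field names are made up for illustration, and the browse/order percentages per mix are taken from the TPC-W specification, not from the benchmark output discussed here. Browse-type versus order-type interactions are a rough proxy for read-only versus updating queries.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RunConfig:
    """Parameters of the runs listed above (field names are hypothetical)."""
    emulated_browsers: int = 20
    ramp_up_s: int = 100
    ramp_down_s: int = 50
    request_images: bool = False
    customers: int = 2880
    items: int = 1000
    start_ebs_incrementally: bool = False


# Percentage of browse-type vs order-type interactions per mix,
# as given in the TPC-W specification (not measured in these runs).
MIX_RATIOS = {
    "browsing": (95, 5),
    "shopping": (80, 20),
    "ordering": (50, 50),
}


def expected_split(mix: str, interactions: int) -> Tuple[int, int]:
    """Return the expected (browse, order) counts for a number of interactions."""
    browse_pct, _order_pct = MIX_RATIOS[mix]
    browse = interactions * browse_pct // 100
    return browse, interactions - browse
```

For example, `expected_split("shopping", 1000)` gives `(800, 200)`, i.e. about one interaction in five is of the order type in the shopping mix.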

I will not provide the full output files here, since they are not user-friendly at all :) WIPS means the number of Web Interactions Per Second (i.e. throughput). This is basically the number of interactions the system handles per second. TT seems to be transaction time or think time, but I'm not sure.

Shopping mix

Browsing mix

Ordering mix
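
As a reminder of what the WIPS number stands for, here is a minimal sketch of how it could be computed from raw completion times. It assumes that interactions finishing during ramp-up or ramp-down are excluded from the measurement; the function name and the input format are hypothetical and do not correspond to the benchmark's Matlab output.

```python
def wips(completion_times, run_length_s, ramp_up_s=100.0, ramp_down_s=50.0):
    """Web interactions per second over the measured part of a run.

    completion_times: seconds since run start at which interactions finished.
    run_length_s:     total run length, including ramp-up and ramp-down.
    """
    start = ramp_up_s
    end = run_length_s - ramp_down_s
    if end <= start:
        raise ValueError("run is too short to contain a measurement interval")
    # Count only interactions completed inside the measurement interval.
    counted = sum(1 for t in completion_times if start <= t < end)
    return counted / (end - start)
```

With the ramp times used here and a hypothetical 400-second run, the measurement interval is the 250 seconds between t=100 and t=350, so three interactions completed in that window would give 3 / 250 = 0.012 WIPS.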

As the next step, I think it would be useful to run the tests with different parameters and only afterwards move on to running them on a system with a replicated database.

A sample setup of qemu machines and VDE networking components is described below. The qemu configuration parameters might not be optimal, since the emphasis so far has been on the network configuration.

Tools & Components

The following tools are used to set up the VDE network:

  • vde_switch - a virtual switch
  • dpipe - a two-way pipe, used here to connect two switches
  • vde_plug - connects a virtual machine (VM) to a switch
  • vde_plug2tap - connects a tap device to a switch
  • qemu - runs the VMs

Network layout

The sample network setup consists of two switched networks. Q1, Q2 and Q3 are qemu virtual machines. S1 and S2 are the VDE switches to which the VMs are connected. S1 and S2 are connected to each other using dpipe and two vde_plug instances. tap0 is an interface on the actual host machine, which is running S2. tap0 is connected to S2 using vde_plug2tap.
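
The layout can be summarised as the list of commands needed to bring it up. The sketch below only builds and prints the command lines; it does not run anything. The socket paths, the assignment of VMs to switches, the qemu binary name, the disk image names and the MAC addresses are all assumptions made for illustration, and the real setup scripts may differ.

```python
import shlex

# Hypothetical control-socket directories; the digest does not name them.
S1_SOCK = "/tmp/vde-s1.ctl"
S2_SOCK = "/tmp/vde-s2.ctl"

# Assumed placement of VMs on switches (not stated in the digest).
VM_SWITCH = {"Q1": S1_SOCK, "Q2": S1_SOCK, "Q3": S2_SOCK}


def switch_cmd(sock):
    """Start a VDE switch as a daemon with its control socket at `sock`."""
    return ["vde_switch", "-s", sock, "-d"]


def link_cmd(sock_a, sock_b):
    """Join two switches: dpipe wires one vde_plug per switch back to back."""
    return ["dpipe", "vde_plug", sock_a, "=", "vde_plug", sock_b]


def tap_cmd(sock, tap="tap0"):
    """Attach a host tap interface to a switch."""
    return ["vde_plug2tap", "-s", sock, "-d", tap]


def qemu_cmd(name, sock, index):
    """Boot one VM with a NIC attached to the switch at `sock`.

    Binary name, image file and MAC address scheme are placeholders.
    """
    mac = "52:54:00:12:34:%02x" % index
    return ["qemu-system-x86_64", "-name", name, "-hda", name.lower() + ".img",
            "-net", "nic,macaddr=" + mac, "-net", "vde,sock=" + sock]


def all_commands():
    """Commands in start-up order: switches, link, tap, then the VMs."""
    cmds = [switch_cmd(S1_SOCK), switch_cmd(S2_SOCK),
            link_cmd(S1_SOCK, S2_SOCK), tap_cmd(S2_SOCK)]
    for i, (name, sock) in enumerate(sorted(VM_SWITCH.items()), start=1):
        cmds.append(qemu_cmd(name, sock, i))
    return cmds


if __name__ == "__main__":
    for cmd in all_commands():
        print(shlex.join(cmd))
```

Note that `-net vde` only works if qemu was built with VDE support, and attaching tap0 normally requires root privileges on the host.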

Scripts

The setup scripts are currently hosted elsewhere. Click on the link to see the code.

  1. Launch S1 and S2, boot machines Q1..Q3 
  2. Connect S1 and S2 to the external world
  3. Stop qemu machines and tear down VDE network
  4. Sample /etc/network/interfaces file for qemu machines

Useful links

Found a nice thesis, "Database Server Workload Characterization in an E-commerce Environment" by Fujian Liu, which could be really helpful for the cloudEco project. Some of its points, though, suggest that we might be going in quite the wrong direction:

_It is very difficult and costly to replicate the database server while maintaining data consistency efficiently in E-commerce systems. Suppose several replicated databases are grouped together to serve queries via a standalone load-balancing machine. Assume a dedicated master database broadcasts any update, such as a new on sale price, to all other databases when it comes to the database group. An update is completed only after a transaction is committed across all the databases to ensure data consistency. This synchronization would be very costly, since the transaction data at an E-commerce site is usually updated with a high frequency, sometimes in a burst mode._