Why You Should Be Using Virtualisation In Your Development Environment (morethanseven.net)
84 points by danw on Nov 5, 2010 | hide | past | favorite | 42 comments


We're big fans of virtualized development environments at my startup. Not only is it very helpful in preventing weird environment-related bugs, as the author states, it's also a huge shortcut when it comes to bringing new developers onboard. Rather than have them spend hours tracking down links to all the right versions of your various libraries and installing things in the appropriate order, you hand them your image. For us, setup time for a new developer has gone from about a day to about 30 minutes, and most of that is time spent downloading VirtualBox.

In addition, if you get a little too clever and accidentally wipe out something crucial in your environment, you just spin up a new VM. It's a tremendous timesaver.


I'm assuming Mitchell hasn't seen this yet, but I'm sure I speak for both of us when I say thank you for including Vagrant in your post.

I would like to point out a few things.

1. Another thing to set up.

Streamlining project setup is one of Vagrant's primary goals. Instead of putting someone who potentially has no background with the app through a 20-step process of installing all the application dependencies, you can simply tell them to run a few commands.
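To make that concrete, here's a hypothetical first-day session (the repo URL and paths are illustrative, and the exact commands may differ between Vagrant versions):

```shell
# Install Vagrant (distributed as a Ruby gem in this era) and bring
# up a project that already ships a Vagrantfile:
gem install vagrant
git clone git://example.com/app.git
cd app
vagrant up     # imports the base box and runs the provisioner
vagrant ssh    # log in to the fully configured guest
```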

2. It's a "Ruby tool".

While I don't think your intention was to pigeonhole it, Vagrant is really meant for any development environment/language/setup. It's just unfortunate that the only supported provisioning tool uses Ruby for its DSL (are there similar tools in other languages?)

(NOTE: it's been a while since I contributed meaningfully to the project)


I'll change the Ruby tool bit. I mentioned elsewhere in the post that it's a great tool whichever language you're using.

Regarding tools other than Chef for provisioning: while there is a Python clone somewhere, Chef will very likely gain the ability to use other languages for recipes very soon. I've seen at least Java and Python mentioned.


I did finally see this entry. It was a great read and I agree completely with it. Note that I posted my comment on the blog directly: http://morethanseven.net/2010/11/04/Why-you-should-be-using-...

Since clicking on that just to read my comment can be annoying, I've copied it below as well:

=========================================================

Yes! Yes Yes Yes! Virtualization for development is extremely important and I'm glad you wrote this. Also, thanks for the hat tip to Vagrant, I appreciate it. I've given a few talks on this and it always amazes me how many people are so comfortable with the status quo of developing on their own machines with apache/mysql/etc installed directly. It's a disaster waiting to happen.

I want to point out to the many people using VMware out there: VMware Fusion is great, yes, I won't argue. Their shared folders are better, again I agree. But to get around VirtualBox's terrible shared-folder performance, Vagrant can use NFS, which is faster than even VMware Fusion's shared folders.

And just because I have things to say, here are a few of my own responses to the common arguments against virtualizing development:

* Speed - Given enough RAM (which for a standard web application, shouldn't be any more than 512 MB to 1 GB for the VM instance), the speed difference is noticeable but not detrimental to your productivity. For regular web requests you won't notice any speed difference. For CPU intensive background tasks, you'll probably see a 1.5x slowdown. Again, unless you're running 5 hour tasks during development, it shouldn't be a big deal, and the benefits outweigh these issues, in my opinion.

* Lower level than you're used to - Then get your friendly sysadmin to set up a base image for your site. A modern sysadmin has many scripts made to automate the setup of the environment for production. There is no reason these scripts can't be used to set up your development environment as well. Use them! Stay in your happy place and just boot up a VM and code! (Although it's my opinion that every developer should take the time to learn their software stack top to bottom.)

* Something else to set up - Once. You only need to learn it once, and it's repeatable and dependable. I would argue that setting up a new software stack every time a dependency changes on your web app is far more than one more "something else to set up."

* Developer workstations should be personal - Right! I agree 100%. So stop installing server crap on your personal computers. Keep your Twitter clients away from your web servers. Use a VM and keep your workstation personal.

Thanks again, Mitchell


I really wish you could virtualize OS X. I would love to build a beefy Linux box as a base machine and virtualize OS X and Windows.

It's my understanding you can do it, but it's a bit hacky, right?


I ran a VMware image of OS X a few years ago. I don't know what it's like now, but back then the lack of an OpenGL driver was the deal breaker. OS X's graphical shell uses Quartz Extreme on top of OpenGL; without GPU acceleration it was virtually unusable.


I'm running a VMware image of 10.6 (Snow Leopard) right now. I didn't set it up, but it does indeed run. It's not perfect. It had problems running Safari 4, but now runs Safari 5 fine. I use it for browser testing. That being said, Flash video crashes it hard, HTML5 video crashes it gently (like caressing a child's face with a butcher knife).

I have a pretty capable machine (Core i7 920, 6GB, GTX260 video card), but speed-wise it is quite usable and I run all sorts of OS X-only apps on it (like the Omni suite) every day.

So... I guess like 7.5 out of 10? Buggy, but really getting close.


I've also found it's generally better to keep your base installation as sparse as possible, and use virtual machines for anything that does not absolutely need to be on the host.

That way you can set up servers, experiment with new platforms and/or applications, hack^Wlearn config files and system internals, etc. without risk of bogging down or blowing up your machine and having no recourse but a full reinstall and reconfig.

I've got core Ubuntu Server, OpenSuse, and CentOS server images, that include extras like git-core, htop, ssh, and a few other utilities I universally depend on, and that I can copy, deploy, and configure for whatever specialized purpose that comes up.

<3 it.


Perhaps a middle ground could be using a virtual machine as a staging environment?


This is what we do. The biggest benefit here, for us, is that if a release goes horribly wrong, or our testers find a disastrous bug, we can easily roll back to a clean state and only lose the time it takes to load the VM.

I've tried using a VM for doing development, but I've run into the issues mentioned in the parent article: Visual Studio 2010 is a memory and CPU hog. I've maxed out the RAM in my laptop, and it still ran too slowly to be really useful.


Not strictly in the spirit of the post, but I use a nice VM setup I thoroughly recommend:

- Host: OSX (2.4GHz i5, 8GB ram)

- Guest1: Ubuntu 10.10 (1.5GB ram allocated; acts as a LAMP server, available to Host and other Guests)

- Guest2: WinXP (1.5GB ram allocated - I use it for IE and for Xara)

This config eats RAM, but affords so many advantages:

1/ Makes my LAMP server portable. Can set up on new computer in no time

2/ Means I can experiment with server software and can revert when I break something

3/ Enables easy cross-PLATFORM browser testing which is vital if you care about fonts and pixels

4/ Can easily fake slow internet connections etc for web dev

5/ Gedit :P

6/ Nautilus :P
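On point 4, one way to fake a slow connection from inside the Linux guest is netem (requires root; the interface name and numbers are just examples):

```shell
# Add 300ms of latency (+/- 50ms jitter) to everything leaving eth0:
tc qdisc add dev eth0 root netem delay 300ms 50ms
# ...and remove it again when you're done:
tc qdisc del dev eth0 root netem
```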

Major disadvantage is filesystem access speed.


PS, if anyone knows a solution to the file access speed issue (not just the filesystem of the guest, but also whatever you share with it from the host) please tell me. Parallels, VMware Fusion and VirtualBox all have this issue and it drives me mad sometimes!


I'm not sure if this helps in your situation, but when I'm running VMware I don't share any local directories. I use rsync to manage the virtual server directories since it's similar to how I deploy to my remote servers.

Having to manually "deploy" code to a local virtual server can be tedious, so running a background process that watches your local directories for changes helps a lot.
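That watcher can be sketched in a couple of lines, assuming inotify-tools on a Linux host (the guest address and paths below are made up):

```shell
# Block until something under ./app changes, then push the delta to
# the VM; repeat forever.
while inotifywait -qq -r -e modify,create,delete,move ./app; do
  rsync -az --delete ./app/ vagrant@192.168.10.10:/var/www/app/
done
```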


I understand you use the rsync approach since it's similar to how you actually deploy, which is completely understandable. I just wanted to note that John and I struggled with the shared folder performance issue in Vagrant for months, and we did a LOT of real-world testing of various solutions (John at one point even integrated background automated rsync directly into Vagrant). We found that NFS is really the only satisfying solution (for us, at least).

NFS allows us to edit code locally on our host, and have it "instantly" be ready on the guest. After just half an hour of working, the various inodes are cached on the guest and file system access over NFS is mostly native-speed since it only needs to grab changed pieces.

The end result: We completely ripped out background rsync despite John working weeks on it, and we put in built-in support for NFS, which has been going strong since around July now and is happily at work at many places that use Vagrant :)
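For reference, turning NFS on is a small change in the Vagrantfile. A minimal sketch using the share_folder syntax from Vagrant releases of this era (the box name and IP are placeholders; NFS requires a host-only network):

```ruby
Vagrant::Config.run do |config|
  config.vm.box = "base"
  config.vm.network "192.168.10.200"   # host-only network, required for NFS
  # Export the project directory over NFS instead of VirtualBox shared folders:
  config.vm.share_folder "v-root", "/vagrant", ".", :nfs => true
end
```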


People might be interested in DoubleDown (http://blog.devstructure.com/introducing-doubledown) from the DevStructure guys. It uses rsync in a similar background manner and it's designed with this exact problem in mind.


Can you elaborate on that background process? Is that something built into rsync?


It's actually something I've been working on that I want to put up on github. I hacked up a Python version of it as a proof-of-concept but rewrote it in C as an installable executable using autotools.

Basically I tell it which directory I want it to watch for changes and it does automatic syncing by piping rsync. I keep a local and remote signature of the files and their timestamps. When it first starts up, it pulls the server sig file and compares it to the local one. If there's a mismatch, the server is updated. From there it manages the signatures locally until they're different.

Writing the sig file to the remote server is done in the same rsync pass because it's stored in the local directory as a dotfile.

I realize rsync does its own checksums, but using my own crude signature files makes it so I don't have to keep calling rsync. I only call it when something changes.
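The signature idea can be sketched in a few lines of Python (my own names, not the commenter's actual code): record each file's mtime, then compare dictionaries to decide whether to bother calling rsync at all.

```python
import os

def build_signature(root):
    """Map each file's path (relative to root) to its modification time."""
    sig = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            sig[os.path.relpath(path, root)] = os.path.getmtime(path)
    return sig

def needs_sync(old_sig, new_sig):
    """True if any file was added, removed, or touched since old_sig."""
    return old_sig != new_sig
```

rsync would then only be invoked when needs_sync returns True, which is the point: a cheap local comparison gates the expensive rsync call.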

I also have some stuff I'm working on that ties into auto-restarting servers when syncing finishes, rolling server deployments for no downtime, db migrations, etc.


Sounds like some pretty cool stuff! I hope you post about it when you do decide to release it. :)


SSD will help make it faster, but the OS will also be that much faster so it won't seem faster by comparison, only in an absolute sense.


From the experience I've had with running Parallels and the like, I disable any sharing of drives. I don't need to drag and drop files between them. After disabling that, I see better performance.

Disabling options like application sharing, drive sharing, video card 3D support, and the like will help. If you don't need it, turn it off.


Vagrant automates the shared folders through NFS to work around VirtualBox's abysmal performance.


I use a similar setup.

- Base: OSX (3.06 GHz Core 2 Duo, 8 GB RAM, 500 GB HDD)

- Guest 1: Windows 7 (3 GB RAM) - for VS 2010 and Silverlight development

- Guest 2: Win XP (512 MB RAM) - for testing with IE6

- Guest 3: Ubuntu 9.04 (need to update)

This works really well, and Win 7 is snappy and VS 2010 works flawlessly. On OSX I always have Chrome, AntiRSI, Things, TextMate, Terminal, NetNewsWire and iTunes open.

A big advantage is that I can fearlessly install things on Windows, I simply back up the VM before any installs/updates.

This config does work pretty well, except 8GB RAM does fall short sometimes. In my experience, VM performance takes a nosedive the moment OSX starts swapping, so this is something I watch, and I now have to reboot almost every day (gone are the month-long sessions).


Now if I could only get some vmware <-> ec2 direct compatibility up in here ... proprietary conversion tools = fail.


Note that it's pretty much a must if you plan to have a MongoDB server as part of your dev environment on your home box - Mongo by design eats all the RAM it can find to map files into, and if you have Apache, Solr and MySQL running on the same box, things can get ugly :)


Why do applications think they can manage my memory better than my kernel?


Because your kernel is general-purpose and has been tuned with a certain balance between latency, throughput, and memory usage (memory management takes memory!). If your application deals with things < 4kb in size, has different performance needs &c, then it may achieve higher performance than just leaving everything up to the kernel.


Actually, MongoDB does leave memory management to the kernel by using mmap to access data files. Most kernels will allocate a lot of memory to the disk cache, which can make it look like mongo is eating all of your RAM.


This. It's one of the brilliant things about mongo. The downside is that after a reboot it's up to you to get the OS to warm those caches back up.


If all you want to do is limit memory usage, process control groups can do that with less overhead than a VM.

kernel/Documentation/cgroups/memory.txt circa line 255
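For example, capping a mongod process with the memory controller that document describes looks roughly like this (run as root; the mount point, group name, and 1G limit are all illustrative):

```shell
# Mount the memory cgroup controller and create a group for mongod:
mkdir -p /cgroup/memory
mount -t cgroup -o memory none /cgroup/memory
mkdir /cgroup/memory/mongodb
echo 1G > /cgroup/memory/mongodb/memory.limit_in_bytes
# Move an already-running mongod into the group by PID:
echo "$MONGOD_PID" > /cgroup/memory/mongodb/tasks
```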


Our servers run CentOS. None of our developers would dream of running that as a desktop OS. Some just use Ubuntu and call that good, but I like to have my testing environment as identical to production as possible, so I took an ancient laptop with a busted display and slapped CentOS on it. As I mentioned in another comment, I set up samba so I can have apache pointing directly to my project folder on my dev machine.

It makes for very convenient testing and it has helped me catch issues others didn't see on their distros, mostly because CentOS has ancient versions of everything and sometimes people don't realize they are relying on a feature that was introduced later.


One tip for the overly enthusiastic (like me): Unfortunately Vagrant does not work on Windows 7 64 bit (yet).

See: https://github.com/mitchellh/vagrant/issues/issue/194


If you're really serious about developing stuff that will end up running on a Linux server, I think you should really be using Linux on the development machine. I know that's not always practical, but more people should do it. No one would dream of writing a Mac app on Windows, and I don't see why Linux should be any different.


If you're targeting a specific platform, it can make sense to use a virtual machine instead of installing that platform on your development machine. As far as code is concerned, it doesn't care that you're running it on a Linux VM inside a Windows dev machine rather than directly on a Linux dev machine.

That being said, if you're trying to target Mac or Windows, it's generally a good idea to use that specific platform's tools. It's reasonable to develop Windows and Mac apps on Linux but it may not be the best environment to do so. It's really up to you as a developer to determine the best toolchain and processes based on what you're trying to accomplish.


If you're targeting a server it can absolutely make sense to run a different OS than your target platform on your /desktop/ computer.


Thanks for the great post. I came to know of Vagrant for the first time reading this, and have instantly fallen in love.


How do you handle version control in this environment? Locally on the mac or in the virtual server?


Personally I edit the code on my mac using gvim, and run and use git on the VM. I've previously used ExpanDrive to mount the directory from the VM for editing, but I've been moving towards using vagrant with its NFS shares.


The way I do it is to write all of the code locally and use git/svn/hg/whatever locally. I then rsync my changes to the virtual machine when I'm ready to test.


I keep all my code locally and have my testing machine (in my case an old laptop, but it should work the same with a VM) map to that dir, so the server is always running the code that I am working on, with no synchronization step slowing me down. :)


I just touched Test and test on my mac and I see both of them just fine.


On what filesystem?

Certainly not the default HFS+, which is case-insensitive but case-preserving.
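A quick probe you can run anywhere (it works in a throwaway temp directory):

```shell
cd "$(mktemp -d)"
touch Test test
# On a default (case-insensitive) HFS+ volume the second touch hits the
# same file and ls prints one name; on a case-sensitive filesystem such
# as ext4 you get two separate files.
ls
```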


It works as you described on my macbook. I was on my Mac at work before.



