Licenses can be a dry topic for some, but they have been key to my teaching. Any of my students and colleagues from the past twenty-plus years knows me as the teacher who was always pushing Linux, Eclipse, Apache, Emacs and other tools that use open source licensing. This remains an important part of my practice and my tool set. Read on to get a view on how my teaching has changed to include more focus on open data and open practices.
If you are reading this blog post, you have some form of data. Your data comes in various forms: photos and videos of your family and friends, assignments for school or work, printouts of important transactions saved as PDFs, scans of purchase receipts. That last one is really important now that almost all retailers use cheap paper and printers that cause your receipts to fade to blank within weeks.
A simple option that will work for many people is to put your files on Dropbox, Google Drive or some similar cloud-based solution. The advantage here is that the setup is simple. Of course, do not rely on only this option. If the company disappears or accidentally loses your data, you will find yourself out of luck. These services will also keep deleted files (and older versions) around for a certain amount of time or number of transactions.
The other issue with these options is that your private data is visible to that company, and to anyone else, including a government, that obtains access to it.
There are ways to encrypt your data before it is backed up so that you effectively are backing up “noise” to the cloud. That will be another post for another day. If you want a good reference about how to secure your Dropbox account, please check out this and other excellent blog posts at How-To-Geek.
My Current Setup (Working Files)
I use Dropbox and Google Drive for various files but mostly for data I want to share easily with others.
For my personal data that I need to have access to on multiple computers (and mobile devices) I have two setups for two types of data:
frequently changed data and small files
large files and projects that do not change so often
The first set is stored in my own git repository on one of my personal Linux servers. Git tracks and stores all changes to the files, so I can roll back to older versions if necessary. Git also makes it easy to synchronize changes across computers, which is important since I use four separate computers for my work on a daily basis.
Git really is not a good option for large files, so I keep those in a separate setup that I simply mirror across my systems using rsync over ssh.
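To make the rsync-over-ssh idea concrete, here is a minimal sketch in Python that only builds the command line for such a mirror. The host name, the paths and the choice of flags are assumptions for illustration, not the exact setup described in this post.

```python
import shlex


def build_rsync_command(source_dir, remote_host, remote_dir, delete=False):
    """Return an rsync argument list that mirrors source_dir to a remote host over ssh."""
    # A trailing slash on the source means "copy the contents of this directory".
    source = source_dir.rstrip("/") + "/"
    command = ["rsync", "--archive", "--compress", "--rsh=ssh"]
    if delete:
        # Optional: also remove files on the remote side that no longer exist locally.
        command.append("--delete")
    command += [source, f"{remote_host}:{remote_dir}"]
    return command


# Hypothetical host and paths, for illustration only.
command = build_rsync_command("/home/me/projects", "laptop.example.org", "/home/me/projects")
print(shlex.join(command))
```

To actually run the mirror, the list could be passed to subprocess.run(command, check=True). Leaving --delete off by default is the cautious choice, since a mistake in the source path would otherwise remove files on the other machine.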
My Current Setup (System Backup)
I have protection with redundant copies between my working machines but I also keep two sets of backups of my main workstation/server. Over the years I have tried various systems but recently I settled on rsnapshot.
This system is very simple to configure. It takes snapshot backups of my system on the following schedule:
every 4 hours, giving six rolling versions over the previous 24 hours
every day, giving me access to the previous 7 days of file versions
weekly, which I have set up to keep the previous 4 weekly versions
monthly, keeping the previous 12 months
yearly, with a plan to keep 10 yearly snapshots
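The schedule above can be sketched as data. The short Python example below renders it as "retain" lines in the style of rsnapshot.conf and counts how many snapshots exist once every level is full. The level names (hourly, daily and so on) are labels chosen here for illustration, and the timing itself would normally come from cron entries that call rsnapshot with each level name.

```python
# Retention levels from the schedule above: (level name, snapshots kept).
# The level names are illustrative labels, not taken from a real config file.
RETENTION = [
    ("hourly", 6),    # every 4 hours, six versions per day
    ("daily", 7),
    ("weekly", 4),
    ("monthly", 12),
    ("yearly", 10),
]


def render_retain_lines(levels):
    """Render the levels as rsnapshot.conf-style lines (fields separated by tabs)."""
    return "\n".join(f"retain\t{name}\t{count}" for name, count in levels)


def total_snapshots(levels):
    """Number of snapshots on disk once every level has filled up."""
    return sum(count for _, count in levels)


print(render_retain_lines(RETENTION))
print("snapshots kept at steady state:", total_snapshots(RETENTION))
```

With these numbers, 39 snapshots are kept at steady state.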
Because rsnapshot hard-links unchanged files between snapshots rather than copying them again, the total space used is only the size of one copy of each file plus any saved previous versions of files that have changed.
And of course, to avoid the disaster of having those backups only on the same machine, I also have a second configuration that I run weekly to a backup drive set (RAID 1 redundant, the same setup as my main storage).
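The space saving comes from hard links, and a toy example shows the effect. The sketch below is not rsnapshot's own code. It is a simplified illustration with made-up file names that builds a second snapshot in which the unchanged file is hard-linked and the changed file is written fresh.

```python
import os
import tempfile


def snapshot_with_hard_links(previous_dir, new_dir, changed):
    """Create new_dir from previous_dir: hard-link unchanged files, write changed ones.

    `changed` maps file names to their new contents (bytes).
    """
    os.makedirs(new_dir)
    for name in os.listdir(previous_dir):
        if name not in changed:
            # Unchanged: a second directory entry for the same data, no extra space.
            os.link(os.path.join(previous_dir, name), os.path.join(new_dir, name))
    for name, data in changed.items():
        with open(os.path.join(new_dir, name), "wb") as handle:
            handle.write(data)


with tempfile.TemporaryDirectory() as root:
    # Older snapshot holding two files.
    first = os.path.join(root, "snapshot.1")
    os.makedirs(first)
    for name, data in {"notes.txt": b"version 1", "photo.jpg": b"pixels"}.items():
        with open(os.path.join(first, name), "wb") as handle:
            handle.write(data)

    # Newer snapshot: only notes.txt has changed.
    second = os.path.join(root, "snapshot.0")
    snapshot_with_hard_links(first, second, {"notes.txt": b"version 2"})

    same_photo = os.path.samefile(os.path.join(first, "photo.jpg"), os.path.join(second, "photo.jpg"))
    same_notes = os.path.samefile(os.path.join(first, "notes.txt"), os.path.join(second, "notes.txt"))
    photo_links = os.stat(os.path.join(second, "photo.jpg")).st_nlink
    print(same_photo, same_notes, photo_links)
```

The unchanged photo is one file on disk that appears in both snapshots, while the edited notes file exists as two separate versions. Hard links only work within a single filesystem, which is why the snapshots live together on one volume.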
Plus I have another backup I run every semester or so and store in another location. Yeah, I’m paranoid.
Click the image by Robert Jacek Tomczak for full credit.
Are you backing your data up? Was this helpful? Let me know.