BoxUp

BoxUp was created to solve an issue I had with the Windows Subsystem for Linux. I needed the ability to quickly transfer files from one system to the other, without the hassle of mounting drives and navigating to them.

View the Repository

What Is This Project

BoxUp was my first attempt at any directory hosting or watching software. It takes a directory you give it and copies that directory to any operating system and folder you want. It does this by tarring and gzipping the directory and all its contents into a stream that is transferred over HTTP.

What Made Me Want To Do This Project

I wanted to explore using streams over HTTP, as well as look into gzipping and tarring. Streams are interesting to me because they allow large amounts of data to be transferred over longer periods of time, which is something we all use every day. As I write this, I am streaming music to my phone, which is something I want to look into in the future. As for gzipping and tarring, together they are one of the easiest ways of bundling and compressing a directory into a single convenient file for download.

What Was The Original Design and How Did It Change Over The Course Of The Project

It was originally going to be a simple “go get a file” kind of application, intended for when you have a file on one PC and want to transfer it to another. The original idea came from using the Linux subsystem on Windows and struggling to get a simple text file from the Windows side to the Linux one. I am positive that there were ways of doing it, and they probably weren’t that hard. But I sensed an opportunity to learn and challenge myself, so I took it.

The original design was a command line tool that you give a path to a file, and it would become a self-hosted, single-endpoint API until the process was stopped. Then, on the other machine, you would run the same CLI to call the API and retrieve the file. I still think this is a cool feature to have, as there is no need to run a process until the tool is needed. However, I realised that if I wanted to host multiple files from different locations, each self-hosted API would need to be unique, which meant giving each one its own port. I thought managing ports was going to be a pain for the user (me, in this case), as you would need to specify the correct port for the file you wanted.

I had two ideas on how to tackle this problem. The first option was to limit the number of self-hosted APIs to one. This way I wouldn’t have to worry about ports and could keep the idea of not running any process unless it’s needed. The second option, and the one I eventually went for, was to create a single API process that would host all the “Boxes” by name. This made the UX much better, as the user could run boxup get box-name to get the files they wanted. Once I started on my API, I branched out from single files into directories, and this is when I decided I would allow entire directories to be “boxed” up.
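To make the “Boxes by name” idea a bit more concrete, here is a minimal sketch of what such a server could look like. It is not the actual BoxUp code: the registry type, the /boxes routes, and the use of query parameters for name and path are all assumptions made for illustration, and the lookup only reports the stored location instead of streaming an archive.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// registry is a hypothetical in-memory store of box name -> server-side path.
type registry struct {
	mu    sync.RWMutex
	boxes map[string]string
}

// ServeHTTP routes POST /boxes?name=..&path=.. (register a box)
// and GET /boxes/<name> (look a box up). Both routes are assumptions.
func (reg *registry) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	switch {
	case r.Method == http.MethodPost && r.URL.Path == "/boxes":
		name := r.URL.Query().Get("name")
		path := r.URL.Query().Get("path")
		if name == "" || path == "" {
			http.Error(w, "name and path are required", http.StatusBadRequest)
			return
		}
		reg.mu.Lock()
		reg.boxes[name] = path
		reg.mu.Unlock()
		w.WriteHeader(http.StatusCreated)
	case r.Method == http.MethodGet && strings.HasPrefix(r.URL.Path, "/boxes/"):
		reg.mu.RLock()
		path, ok := reg.boxes[strings.TrimPrefix(r.URL.Path, "/boxes/")]
		reg.mu.RUnlock()
		if !ok {
			http.NotFound(w, r)
			return
		}
		// A real server would stream the tar.gz here; the sketch only reports the location.
		fmt.Fprintln(w, path)
	default:
		http.NotFound(w, r)
	}
}

// do sends one in-process request to the handler and returns the status and body.
func do(h http.Handler, method, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	reg := &registry{boxes: map[string]string{}}
	requests := [][2]string{
		{http.MethodPost, "/boxes?name=docs&path=/srv/docs"},
		{http.MethodGet, "/boxes/docs"},
		{http.MethodGet, "/boxes/missing"},
	}
	for _, req := range requests {
		code, body := do(reg, req[0], req[1])
		fmt.Printf("%s %s -> %d %q\n", req[0], req[1], code, body)
	}
}
```

Because every box lives behind one process and one port, the client only ever needs a host, a port, and a name, which is the UX improvement described above.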

Hopefully this has explained why I decided to name it BoxUp, as well as the reason I thought of the project in the first place.

What I Learnt

This project has taught me quite a few things. Firstly, I now understand how streams can be nested inside each other to make reading and writing easy. This is something I didn’t really know about despite it being so simple, nor did I know how useful it would be. Below is a snippet of the BoxUp code where I use this:

...

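// Data flows tarball -> gzipwriter -> writer.
gzipwriter := gzip.NewWriter(writer)
defer gzipwriter.Close() // deferred first, so it runs last and flushes the gzip footer
tarball := tar.NewWriter(gzipwriter)
defer tarball.Close() // runs first and writes the tar trailer into the gzip stream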

...

As you can see, it’s super simple to take multiple writers and stack them on top of each other.
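To show the stacking end to end, here is a small self-contained sketch that packs one in-memory file through tar and gzip into a buffer, then reads it back by stacking the readers in the opposite order. The function names (packOne, unpackOne, roundTrip) are made up for this example and are not part of BoxUp. The destination is a bytes.Buffer here, but any io.Writer works the same way, including an http.ResponseWriter.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// packOne writes a single in-memory file through tar -> gzip -> dst.
func packOne(dst io.Writer, name string, data []byte) error {
	gz := gzip.NewWriter(dst)
	tw := tar.NewWriter(gz)

	hdr := &tar.Header{
		Typeflag: tar.TypeReg,
		Name:     name,
		Mode:     0o644,
		Size:     int64(len(data)),
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	if _, err := tw.Write(data); err != nil {
		return err
	}
	// Close innermost first: tar writes its trailer into gzip,
	// then gzip flushes its footer into dst.
	if err := tw.Close(); err != nil {
		return err
	}
	return gz.Close()
}

// unpackOne reverses packOne by stacking readers: src -> gzip -> tar.
func unpackOne(src io.Reader) (string, []byte, error) {
	gz, err := gzip.NewReader(src)
	if err != nil {
		return "", nil, err
	}
	defer gz.Close()

	tr := tar.NewReader(gz)
	hdr, err := tr.Next()
	if err != nil {
		return "", nil, err
	}
	data, err := io.ReadAll(tr)
	return hdr.Name, data, err
}

// roundTrip packs and unpacks one file through an in-memory buffer.
func roundTrip(name, content string) (string, string, error) {
	var buf bytes.Buffer
	if err := packOne(&buf, name, []byte(content)); err != nil {
		return "", "", err
	}
	gotName, gotData, err := unpackOne(&buf)
	return gotName, string(gotData), err
}

func main() {
	name, content, err := roundTrip("hello.txt", "hello from a box\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: %q\n", name, content)
}
```

The one thing to watch is the closing order. Neither tar.Writer.Close nor gzip.Writer.Close closes the writer underneath it, so each layer has to be closed explicitly, innermost first.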

Secondly, I got to learn and use the cobra library to create my CLI. I have had my eye on cobra for a while, as it has been used by quite a few large applications. It’s extremely simple to use, but can quickly create powerful and professional CLIs. My CLI is rather simple and doesn’t require too many complicated flags or arguments. However, I made full use of the built-in --help documentation and the code generation for creating each of the commands. Below is a code snippet of the add command:

var addCmd = &cobra.Command{
	Use:   "add",
	Short: "Adds a new Box to the Server",
	Long: `Calling this command will request a new Box to be added to the Server.
	
	NOTE: The path is relative to the server location, not where this command is being called.`,
	Args: func(cmd *cobra.Command, args []string) error {
		nameFlag := cmd.Flag("name")
		pathFlag := cmd.Flag("path")
		if nameFlag.Value.String() == "" || pathFlag.Value.String() == "" {
			return errors.New("You must specify both --name and --path")
		}
		return nil
	},
	Run: func(cmd *cobra.Command, args []string) {
		host := cmd.Flag("host").Value.String()
		port := cmd.Flag("port").Value.String()
		name := cmd.Flag("name").Value.String()
		path := cmd.Flag("path").Value.String()
		fmt.Printf("host:%v\n", host)
		fmt.Printf("port:%v\n", port)
		fmt.Printf("name:%v\n", name)
		fmt.Printf("path:%v\n", path)

		err := addBox(host, port, name, path)
		if err != nil {
			fmt.Printf("Error adding new Box: %v\n", err)
		}
	},
}

func init() {
	rootCmd.AddCommand(addCmd)
	addCmd.Flags().StringP("name", "n", "", "Specify the name of new Box")
	addCmd.Flags().StringP("path", "p", "", "Specify the path of the Box")
}

I have removed the comments to free up some space, but the logic is the same. As you can see, you just create a cobra.Command value and pass it to rootCmd.AddCommand(). This also allows you to create multiple layers of commands, so you can have something like boxup add box -h -p -n. There is also a separate field (Args) where you can do validation apart from the actual Run function, which keeps validation and logic separate and helps readability.

What I Struggled With

I struggled a lot with understanding how to use tar for folders with multiple layers. I kept getting confused by the headers and by the fact that a directory needs a header but no body. After doing some research, I found some different examples and came up with this simple snippet of code:

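// Walk every entry under the box location and write it into the tarball.
return filepath.Walk(Boxes[name].Location,
	func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}

		// Build most of the header (size, mode, timestamps) from the file info.
		header, err := tar.FileInfoHeader(info, info.Name())
		if err != nil {
			return err
		}

		// Store the entry relative to the box rather than under its full server path.
		// ToSlash keeps the name portable when the server runs on Windows.
		if baseDir != "" {
			header.Name = filepath.ToSlash(filepath.Join(baseDir, strings.TrimPrefix(path, Boxes[name].Location)))
		}

		if err := tarball.WriteHeader(header); err != nil {
			return err
		}

		// Directories only need a header; there is no body to copy.
		if info.IsDir() {
			return nil
		}

		file, err := os.Open(path)
		if err != nil {
			return err
		}
		defer file.Close()
		_, err = io.Copy(tarball, file)
		return err
	})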

As you can see from the code, Go’s archive/tar package provides a function, tar.FileInfoHeader, that takes an os.FileInfo and builds most of the header for you. After that, all you need to do is set header.Name so that it is relative to the box instead of containing the full path of the box location, and then write the header. Once the header was out of the way and I understood how it works, all I needed to do was skip writing a body for directories and only copy the contents if it’s a file.
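The header name is the part that is easiest to get wrong, so here is a small sketch of one way to compute it on its own. This is an alternative to the strings.TrimPrefix approach in the snippet, not what BoxUp does: the archiveName helper is hypothetical, and it uses filepath.Rel plus filepath.ToSlash so that names inside the archive are always relative and always use forward slashes, even if the server is running on Windows.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// archiveName returns the name a file should have inside the tarball:
// baseDir joined with the file's path relative to the box root.
// (Hypothetical helper for illustration only.)
func archiveName(baseDir, root, file string) (string, error) {
	rel, err := filepath.Rel(root, file)
	if err != nil {
		return "", err
	}
	// Tar entry names use forward slashes regardless of the host OS,
	// so convert before joining with the slash-only path package.
	return path.Join(baseDir, filepath.ToSlash(rel)), nil
}

func main() {
	root := "/srv/data"
	files := []string{
		"/srv/data",
		"/srv/data/notes.txt",
		"/srv/data/sub/a.txt",
	}
	for _, f := range files {
		name, err := archiveName("my-box", root, f)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", f, name)
	}
}
```

The root directory itself maps to just the base directory name, which gives the receiving side a single top-level folder to unpack into.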

What Improvements Do I Want To Make / What Are The Chances Of That Happening

There are a number of improvements that I want to make to this project, as I feel it hasn’t reached its full potential yet.

Firstly, I want to bring back the “self-hosting” feature that I originally designed BoxUp around. It would be a simple addition where you could only run one instance of the “quick-host” feature at a time, which avoids the port management issue described above in What Was The Original Design and How Did It Change Over The Course Of The Project. The reason I think this is a good feature to introduce is the ease of use that it brings. You could just download the client on both ends and still use its main functionality, without having to worry about running a server application. The chances of this happening are quite good, as I plan to use BoxUp at home and for future projects.

Secondly, I want to take the gzipping and tarring code and put it in its own package called “boxup/compression” so that I can quickly reference it in other projects. I have a strong feeling I’m going to use this code again in the future, so it would be useful to have it easily accessible. I feel this is quite likely as well, because of how useful it would be.

I might eventually also look at adding some file system syncing so that if a box changes on either system, it is updated on all the others. I believe this would be quite a large task and would require me to learn new libraries to handle it. I don’t think this change is likely to happen unless I really need it for work or it would make my life considerably easier.