2022 Day 7

Posted on Dec 7, 2022

Part 1

Problem

Read in a list of lines which represent one of the following:

  • Command: Starts with a $.
  • Directory: Starts with dir and is followed by the name.
  • Regular File: Starts with a number (the file size) and is followed by the name.

The size of a directory is the sum of the regular files in the directory, plus any directories underneath it.

Find the sum of the sizes of all directories with a size less than or equal to 100000.

General solution

As we have a root directory and each directory can contain other directories and files, a tree seems the best fit data structure. Additionally, we will follow the Unix design in treating directories as files.

Files will have the following fields:

  • Name: This must be unique within a directory but may not be unique within the filesystem.
  • Path: Full absolute path to a file. Unique within the file system.
  • Type: Regular file or directory. We need this because we cannot tell from the other fields when a file is a regular file or a directory (e.g. whilst all regular files will have no children, it is also possible for a directory to have no children).
  • Size: For a regular file this will be the file size listed in the input. For a directory it will be the sum of all its children.
  • Children: A list of all the child files. Regular files will have no children, directories will have zero or more children.
  • Parent: Although we could find the parent file by searching the tree, including a field for it will make it easier to move up the filesystem as well as down. All files will have exactly one parent, except the root file.

The tree will start with a special root file, which will be a directory. Everything else will be an descendant of this file.

Solution: Go

First of all, we need to design a data structure for our tree. Two important things I learnt here are:

Struct fields should be capitalised if they are public, e.g. Name rather than name (this doesn’t matter too much in these exercises as everything is contained within the same package, and visibility only applies at the package level).

Structs cannot directly contain fields of the same struct. They can however contain a field which is a pointer to the same struct.

As an example:

type File struct {
    Parent File
}

will not compile. The correct code is:

type File struct {
    Parent *File
}

This makes sense and is similar to how self-referencing structs are used in C.

The data structure design ends up as:

type File struct {
	Name     string
	Path     string
	FileType int
	Size     int
	Parent   *File
	Children []File
}

Go will initialise struct fields with ’empty’ values, such as 0 for int, which allows us to create a File before we know its size. This differs from C, where the value of an uninitialised variable or field is undefined - and therefore could be anything.

We then read in every line of the file and work out whether it is a command, directory, or regular file, and add items to the filesystem tree as appropriate.

The first step is to create the root file, as every filesystem must have a root:

root := &File{
	Name:     "",
	Path:     "/",
	FileType: TypeDirectory,
	Size:     0,
}

We then check the line type (command, directory, regular file) and process as follows:

Command: If the command is cd, we either move up one level (..) or create that directory in our filesystem tree (under the current file, and only if it does not exist) and then change the current file to be that directory. If the command is ls we do nothing because we will process the directory listing when we read in the following lines.

Directory or Regular File: We split the two parts of the line up. The second part is always the name of the file, and the first line is either dir (for a directory) or the file size (for a regular file). We populate a File struct and add it to the children of the current directory. If the new file is a directory, we leave the size as 0 because we cannot calculate the size of a directory until we have all of its descendants.

After processing the input, we now have a filesystem tree. At this point we could be anywhere within the tree, but we need to ‘walk’ up to the top as our next operation will start from the root. This is trivial as we have a File.parent field, and we can tell when we have reached the root because the parent field will be nil.

for ; currentFile.Parent != nil; currentFile = currentFile.Parent {
    // Intentionally empty body
}

At this point the directory sizes are still 0, so we need a function to recursively walk the tree (depth first) and complete the sizes.

func setDirectorySize(directory *File) {
	directory.Size = 0

	for c := range directory.Children {
		if directory.Children[c].FileType == TypeDirectory {
			setDirectorySize(&directory.Children[c])
		}

		directory.Size += directory.Children[c].Size
	}
}

Finally, we need to find all the directories with a size less than or equal to 100000. We could walk the tree to find them, but an alternative is to ‘flatten’ the tree into a map and then iterate over the map. This works because at this point we are not interested in the order of the directories or their position within the tree - we just want to process them all. Flattening the tree may be useful in future exercises as well, so hopefully we can re-use this function (or something similar).

func flattenTree(file *File) map[string]File {
	files := make(map[string]File)

	// Add the current file
	files[file.Path] = File{
		Name:     file.Name,
		Path:     file.Path,
		FileType: file.FileType,
		Size:     file.Size,
	}

	// Add all the children, recursively
	// Note: We may 'add' a file multiple times, however as the path is used
	// as the map key, they will only appear once
	for c := range file.Children {
		childFiles := flattenTree(&file.Children[c])

		for key, value := range childFiles {
			files[childFiles[key].Path] = File{
				Name:     value.Name,
				Path:     value.Path,
				FileType: value.FileType,
				Size:     value.Size,
			}
		}
	}

	return files
}

The source for the full solution:

package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"strconv"
	"strings"
)

const (
	TypeRegularFile = iota
	TypeDirectory
)

const MaxTotalSize = 100000

type File struct {
	Name     string
	Path     string
	FileType int
	Size     int
	Parent   *File
	Children []File
}

func getFileInDirectory(directory *File, fileName string) *File {
	for c := range directory.Children {
		if directory.Children[c].Name == fileName {
			return &directory.Children[c]
		}
	}

	return nil
}

func setDirectorySize(directory *File) {
	directory.Size = 0

	for c := range directory.Children {
		if directory.Children[c].FileType == TypeDirectory {
			setDirectorySize(&directory.Children[c])
		}

		directory.Size += directory.Children[c].Size
	}
}

func flattenTree(file *File) map[string]File {
	files := make(map[string]File)

	// Add the current file
	files[file.Path] = File{
		Name:     file.Name,
		Path:     file.Path,
		FileType: file.FileType,
		Size:     file.Size,
	}

	// Add all the children, recursively
	// Note: We may 'add' a file multiple times, however as the path is used
	// as the map key, they will only appear once
	for c := range file.Children {
		childFiles := flattenTree(&file.Children[c])

		for key, value := range childFiles {
			files[childFiles[key].Path] = File{
				Name:     value.Name,
				Path:     value.Path,
				FileType: value.FileType,
				Size:     value.Size,
			}
		}
	}

	return files
}

func main() {
	regularFileRegex := regexp.MustCompile(`^\d+\s+[a-z\.]+$`)

	// Define the root file as our filesystem must have one
	root := &File{
		Name:     "",
		Path:     "/",
		FileType: TypeDirectory,
		Size:     0,
	}

	currentFile := root

	scanner := bufio.NewScanner(os.Stdin)

	// Read in each line and convert into tree
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())

		if strings.Index(line, "$") == 0 {
			// Command
			command := strings.Fields(line)

			if command[1] == "cd" {
				// Change into a directory
				if command[2] == ".." {
					// Go up one level if we are not already at the top level
					if currentFile.Parent != nil {
						currentFile = currentFile.Parent
					}
				} else {
					// Change into a directory - create it if it does not already exist
					directoryName := command[2]

					if strings.Index(directoryName, "/") == 0 {
						// Absolute change of directory, find path
					} else {
						directory := getFileInDirectory(currentFile, directoryName)

						if directory != nil {
							currentFile = directory
						} else {
							filePath := currentFile.Path + "/" + directoryName

							// If there are two leading forward slashes, remove one
							if strings.Index(filePath, "//") == 0 {
								filePath = filePath[1:]
							}

							newDirectory := &File{
								Name:     directoryName,
								Path:     filePath,
								FileType: TypeDirectory,
								Size:     0,
								Parent:   currentFile,
							}

							currentFile.Children = append(currentFile.Children, *newDirectory)
							currentFile = newDirectory
						}
					}
				}
			} else if command[1] == "ls" {
				// Listing directory - we can skip this as we will process the contents
				// when we read the next lines
			} else {
				fmt.Println("Unexpected command: " + command[1])
				os.Exit(1)
			}
		} else {
			isDirectory := strings.Index(line, "dir") == 0
			isRegularFile := regularFileRegex.MatchString(line)

			if isDirectory || isRegularFile {
				fileParts := strings.Fields(line)

				// Assume we are working with a directory, then override
				// if we have a regular file
				fileSize := 0
				fileName := fileParts[1]
				fileType := TypeDirectory

				if isRegularFile {
					fileSize, _ = strconv.Atoi(fileParts[0])
					fileType = TypeRegularFile
				}

				// Check if file exists
				// This is necessary because we may have seen this file already,
				// for example if we have changed into a directory and run ls in its parent
				existingFile := getFileInDirectory(currentFile, fileName)

				if existingFile == nil {
					// Add file to this level of the tree
					filePath := currentFile.Path + "/" + fileName

					// If there are two leading forward slashes, remove one
					if strings.Index(filePath, "//") == 0 {
						filePath = filePath[1:]
					}

					newFile := File{
						Name:     fileName,
						Path:     filePath,
						FileType: fileType,
						Size:     fileSize,
						Parent:   currentFile,
					}

					currentFile.Children = append(currentFile.Children, newFile)
				}
			}
		}
	}

	// Recursively walk the tree and populate all directory sizes
	// Go all the way up to the top
	for ; currentFile.Parent != nil; currentFile = currentFile.Parent {
		// Intentionally empty body
	}

	// Set the directory sizes from the root down
	setDirectorySize(currentFile)

	// Flatten the tree into a map of path -> file so we can process it iteratively
	flattenedTree := flattenTree(currentFile)

	underSizeSum := 0

	for _, file := range flattenedTree {
		if file.FileType == TypeDirectory && file.Size <= MaxTotalSize {
			underSizeSum += file.Size
		}
	}

	fmt.Println(underSizeSum)
}

Part 2

Problem

The same as part 1, except we need to find the smallest directory that can be deleted to ensure we have at least 30000000 free space.

General solution

The same as part 1, except we need to calculate the current free space (before deletions) and the amount we need to free up (to reach 30000000). We then iterate over the flattened map, starting at the root file (which deleting will by definition free up enough space because it represents the entire filesystem), and keep a track of the smallest directory that is greater than or equal to the required free space.

Solution: Go

The same as part 1, except with some extra constants and calculations to find the required free space, and then a slightly different loop at the end.

package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"strconv"
	"strings"
)

const (
	TypeRegularFile = iota
	TypeDirectory
)

const TotalDiskSpace = 70000000
const UpgradeSpaceRequirement = 30000000

type File struct {
	Name     string
	Path     string
	FileType int
	Size     int
	Parent   *File
	Children []File
}

func getFileInDirectory(directory *File, fileName string) *File {
	for c := range directory.Children {
		if directory.Children[c].Name == fileName {
			return &directory.Children[c]
		}
	}

	return nil
}

func setDirectorySize(directory *File) {
	directory.Size = 0

	for c := range directory.Children {
		if directory.Children[c].FileType == TypeDirectory {
			setDirectorySize(&directory.Children[c])
		}

		directory.Size += directory.Children[c].Size
	}
}

func flattenTree(file *File) map[string]File {
	files := make(map[string]File)

	// Add the current file
	files[file.Path] = File{
		Name:     file.Name,
		Path:     file.Path,
		FileType: file.FileType,
		Size:     file.Size,
	}

	// Add all the children, recursively
	// Note: We may 'add' a file multiple times, however as the path is used
	// as the map key, they will only appear once
	for c := range file.Children {
		childFiles := flattenTree(&file.Children[c])

		for key, value := range childFiles {
			files[childFiles[key].Path] = File{
				Name:     value.Name,
				Path:     value.Path,
				FileType: value.FileType,
				Size:     value.Size,
			}
		}
	}

	return files
}

func main() {
	regularFileRegex := regexp.MustCompile(`^\d+\s+[a-z\.]+$`)

	// Define the root file as our filesystem must have one
	root := &File{
		Name:     "",
		Path:     "/",
		FileType: TypeDirectory,
		Size:     0,
	}

	currentFile := root

	scanner := bufio.NewScanner(os.Stdin)

	// Read in each line and convert into tree
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())

		if strings.Index(line, "$") == 0 {
			// Command
			command := strings.Fields(line)

			if command[1] == "cd" {
				// Change into a directory
				if command[2] == ".." {
					// Go up one level if we are not already at the top level
					if currentFile.Parent != nil {
						currentFile = currentFile.Parent
					}
				} else {
					// Change into a directory - create it if it does not already exist
					directoryName := command[2]

					if strings.Index(directoryName, "/") == 0 {
						// Absolute change of directory, find path
					} else {
						directory := getFileInDirectory(currentFile, directoryName)

						if directory != nil {
							currentFile = directory
						} else {
							filePath := currentFile.Path + "/" + directoryName

							// If there are two leading forward slashes, remove one
							if strings.Index(filePath, "//") == 0 {
								filePath = filePath[1:]
							}

							newDirectory := &File{
								Name:     directoryName,
								Path:     filePath,
								FileType: TypeDirectory,
								Size:     0,
								Parent:   currentFile,
							}

							currentFile.Children = append(currentFile.Children, *newDirectory)
							currentFile = newDirectory
						}
					}
				}
			} else if command[1] == "ls" {
				// Listing directory - we can skip this as we will process the contents
				// when we read the next lines
			} else {
				fmt.Println("Unexpected command: " + command[1])
				os.Exit(1)
			}
		} else {
			isDirectory := strings.Index(line, "dir") == 0
			isRegularFile := regularFileRegex.MatchString(line)

			if isDirectory || isRegularFile {
				fileParts := strings.Fields(line)

				// Assume we are working with a directory, then override
				// if we have a regular file
				fileSize := 0
				fileName := fileParts[1]
				fileType := TypeDirectory

				if isRegularFile {
					fileSize, _ = strconv.Atoi(fileParts[0])
					fileType = TypeRegularFile
				}

				// Check if file exists
				// This is necessary because we may have seen this file already,
				// for example if we have changed into a directory and run ls in its parent
				existingFile := getFileInDirectory(currentFile, fileName)

				if existingFile == nil {
					// Add file to this level of the tree
					filePath := currentFile.Path + "/" + fileName

					// If there are two leading forward slashes, remove one
					if strings.Index(filePath, "//") == 0 {
						filePath = filePath[1:]
					}

					newFile := File{
						Name:     fileName,
						Path:     filePath,
						FileType: fileType,
						Size:     fileSize,
						Parent:   currentFile,
					}

					currentFile.Children = append(currentFile.Children, newFile)
				}
			}
		}
	}

	// Recursively walk the tree and populate all directory sizes
	// Go all the way up to the top
	for ; currentFile.Parent != nil; currentFile = currentFile.Parent {
		// Intentionally empty body
	}

	// Set the directory sizes from the root down
	setDirectorySize(currentFile)

	currentFreeSpace := TotalDiskSpace - currentFile.Size
	extraSpaceRequired := UpgradeSpaceRequirement - currentFreeSpace

	// Flatten the tree into a map of path -> file so we can process it iteratively
	flattenedTree := flattenTree(currentFile)

	// Deleting the root directory will be definition free up enough space
	deleteDirectory := flattenedTree["/"]

	for _, file := range flattenedTree {
		if file.FileType == TypeDirectory && file.Size >= extraSpaceRequired && file.Size < deleteDirectory.Size {
			deleteDirectory = file
		}
	}

	fmt.Println(deleteDirectory.Size)
}

Post-solution thoughts

It has been pointed out that there are quick and dirty solutions which would involve less code - strictly there is no need to build a tree and a map would probably suffice. However, designing, populating and processing a tree was a good thought exercise, and part of the fun of Advent of Code is learning new things (or re-learning topics from my undergraduate degree).