2022 Day 7
Part 1
Problem
Read in a list of lines which represent one of the following:
- Command: Starts with a
$
. - Directory: Starts with
dir
and is followed by the name. - Regular File: Starts with a number (the file size) and is followed by the name.
The size of a directory is the sum of the regular files in the directory, plus any directories underneath it.
Find the sum of the sizes of all directories with a size less than or equal to 100000.
General solution
As we have a root directory and each directory can contain other directories and files, a tree seems the best fit data structure. Additionally, we will follow the Unix design in treating directories as files.
Files will have the following fields:
- Name: This must be unique within a directory but may not be unique within the filesystem.
- Path: Full absolute path to a file. Unique within the file system.
- Type: Regular file or directory. We need this because we cannot tell from the other fields when a file is a regular file or a directory (e.g. whilst all regular files will have no children, it is also possible for a directory to have no children).
- Size: For a regular file this will be the file size listed in the input. For a directory it will be the sum of all its children.
- Children: A list of all the child files. Regular files will have no children, directories will have zero or more children.
- Parent: Although we could find the parent file by searching the tree, including a field for it will make it easier to move up the filesystem as well as down. All files will have exactly one parent, except the root file.
The tree will start with a special root file, which will be a directory. Everything else will be an descendant of this file.
Solution: Go
First of all, we need to design a data structure for our tree. Two important things I learnt here are:
Struct fields should be capitalised if they are public, e.g. Name
rather than name
(this doesn’t matter too much in these exercises as everything is contained within the same package, and visibility only applies at the package level).
Structs cannot directly contain fields of the same struct. They can however contain a field which is a pointer to the same struct.
As an example:
type File struct {
Parent File
}
will not compile. The correct code is:
type File struct {
Parent *File
}
This makes sense and is similar to how self-referencing structs are used in C.
The data structure design ends up as:
type File struct {
Name string
Path string
FileType int
Size int
Parent *File
Children []File
}
Go will initialise struct fields with ’empty’ values, such as 0 for int, which allows us to create a File before we know its size. This differs from C, where the value of an uninitialised variable or field is undefined - and therefore could be anything.
We then read in every line of the file and work out whether it is a command, directory, or regular file, and add items to the filesystem tree as appropriate.
The first step is to create the root file, as every filesystem must have a root:
root := &File{
Name: "",
Path: "/",
FileType: TypeDirectory,
Size: 0,
}
We then check the line type (command, directory, regular file) and process as follows:
Command: If the command is cd
, we either move up one level (..
) or create that directory in our filesystem tree (under the current file, and only if it does not exist) and then change the current file to be that directory. If the command is ls
we do nothing because we will process the directory listing when we read in the following lines.
Directory or Regular File: We split the two parts of the line up. The second part is always the name of the file, and the first line is either dir
(for a directory) or the file size (for a regular file). We populate a File struct and add it to the children of the current directory. If the new file is a directory, we leave the size as 0 because we cannot calculate the size of a directory until we have all of its descendants.
After processing the input, we now have a filesystem tree. At this point we could be anywhere within the tree, but we need to ‘walk’ up to the top as our next operation will start from the root. This is trivial as we have a File.parent
field, and we can tell when we have reached the root because the parent field will be nil
.
for ; currentFile.Parent != nil; currentFile = currentFile.Parent {
// Intentionally empty body
}
At this point the directory sizes are still 0, so we need a function to recursively walk the tree (depth first) and complete the sizes.
func setDirectorySize(directory *File) {
directory.Size = 0
for c := range directory.Children {
if directory.Children[c].FileType == TypeDirectory {
setDirectorySize(&directory.Children[c])
}
directory.Size += directory.Children[c].Size
}
}
Finally, we need to find all the directories with a size less than or equal to 100000. We could walk the tree to find them, but an alternative is to ‘flatten’ the tree into a map and then iterate over the map. This works because at this point we are not interested in the order of the directories or their position within the tree - we just want to process them all. Flattening the tree may be useful in future exercises as well, so hopefully we can re-use this function (or something similar).
func flattenTree(file *File) map[string]File {
files := make(map[string]File)
// Add the current file
files[file.Path] = File{
Name: file.Name,
Path: file.Path,
FileType: file.FileType,
Size: file.Size,
}
// Add all the children, recursively
// Note: We may 'add' a file multiple times, however as the path is used
// as the map key, they will only appear once
for c := range file.Children {
childFiles := flattenTree(&file.Children[c])
for key, value := range childFiles {
files[childFiles[key].Path] = File{
Name: value.Name,
Path: value.Path,
FileType: value.FileType,
Size: value.Size,
}
}
}
return files
}
The source for the full solution:
package main
import (
"bufio"
"fmt"
"os"
"regexp"
"strconv"
"strings"
)
const (
TypeRegularFile = iota
TypeDirectory
)
const MaxTotalSize = 100000
type File struct {
Name string
Path string
FileType int
Size int
Parent *File
Children []File
}
func getFileInDirectory(directory *File, fileName string) *File {
for c := range directory.Children {
if directory.Children[c].Name == fileName {
return &directory.Children[c]
}
}
return nil
}
func setDirectorySize(directory *File) {
directory.Size = 0
for c := range directory.Children {
if directory.Children[c].FileType == TypeDirectory {
setDirectorySize(&directory.Children[c])
}
directory.Size += directory.Children[c].Size
}
}
func flattenTree(file *File) map[string]File {
files := make(map[string]File)
// Add the current file
files[file.Path] = File{
Name: file.Name,
Path: file.Path,
FileType: file.FileType,
Size: file.Size,
}
// Add all the children, recursively
// Note: We may 'add' a file multiple times, however as the path is used
// as the map key, they will only appear once
for c := range file.Children {
childFiles := flattenTree(&file.Children[c])
for key, value := range childFiles {
files[childFiles[key].Path] = File{
Name: value.Name,
Path: value.Path,
FileType: value.FileType,
Size: value.Size,
}
}
}
return files
}
func main() {
regularFileRegex := regexp.MustCompile(`^\d+\s+[a-z\.]+$`)
// Define the root file as our filesystem must have one
root := &File{
Name: "",
Path: "/",
FileType: TypeDirectory,
Size: 0,
}
currentFile := root
scanner := bufio.NewScanner(os.Stdin)
// Read in each line and convert into tree
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if strings.Index(line, "$") == 0 {
// Command
command := strings.Fields(line)
if command[1] == "cd" {
// Change into a directory
if command[2] == ".." {
// Go up one level if we are not already at the top level
if currentFile.Parent != nil {
currentFile = currentFile.Parent
}
} else {
// Change into a directory - create it if it does not already exist
directoryName := command[2]
if strings.Index(directoryName, "/") == 0 {
// Absolute change of directory, find path
} else {
directory := getFileInDirectory(currentFile, directoryName)
if directory != nil {
currentFile = directory
} else {
filePath := currentFile.Path + "/" + directoryName
// If there are two leading forward slashes, remove one
if strings.Index(filePath, "//") == 0 {
filePath = filePath[1:]
}
newDirectory := &File{
Name: directoryName,
Path: filePath,
FileType: TypeDirectory,
Size: 0,
Parent: currentFile,
}
currentFile.Children = append(currentFile.Children, *newDirectory)
currentFile = newDirectory
}
}
}
} else if command[1] == "ls" {
// Listing directory - we can skip this as we will process the contents
// when we read the next lines
} else {
fmt.Println("Unexpected command: " + command[1])
os.Exit(1)
}
} else {
isDirectory := strings.Index(line, "dir") == 0
isRegularFile := regularFileRegex.MatchString(line)
if isDirectory || isRegularFile {
fileParts := strings.Fields(line)
// Assume we are working with a directory, then override
// if we have a regular file
fileSize := 0
fileName := fileParts[1]
fileType := TypeDirectory
if isRegularFile {
fileSize, _ = strconv.Atoi(fileParts[0])
fileType = TypeRegularFile
}
// Check if file exists
// This is necessary because we may have seen this file already,
// for example if we have changed into a directory and run ls in its parent
existingFile := getFileInDirectory(currentFile, fileName)
if existingFile == nil {
// Add file to this level of the tree
filePath := currentFile.Path + "/" + fileName
// If there are two leading forward slashes, remove one
if strings.Index(filePath, "//") == 0 {
filePath = filePath[1:]
}
newFile := File{
Name: fileName,
Path: filePath,
FileType: fileType,
Size: fileSize,
Parent: currentFile,
}
currentFile.Children = append(currentFile.Children, newFile)
}
}
}
}
// Recursively walk the tree and populate all directory sizes
// Go all the way up to the top
for ; currentFile.Parent != nil; currentFile = currentFile.Parent {
// Intentionally empty body
}
// Set the directory sizes from the root down
setDirectorySize(currentFile)
// Flatten the tree into a map of path -> file so we can process it iteratively
flattenedTree := flattenTree(currentFile)
underSizeSum := 0
for _, file := range flattenedTree {
if file.FileType == TypeDirectory && file.Size <= MaxTotalSize {
underSizeSum += file.Size
}
}
fmt.Println(underSizeSum)
}
Part 2
Problem
The same as part 1, except we need to find the smallest directory that can be deleted to ensure we have at least 30000000 free space.
General solution
The same as part 1, except we need to calculate the current free space (before deletions) and the amount we need to free up (to reach 30000000). We then iterate over the flattened map, starting at the root file (which deleting will by definition free up enough space because it represents the entire filesystem), and keep a track of the smallest directory that is greater than or equal to the required free space.
Solution: Go
The same as part 1, except with some extra constants and calculations to find the required free space, and then a slightly different loop at the end.
package main
import (
"bufio"
"fmt"
"os"
"regexp"
"strconv"
"strings"
)
const (
TypeRegularFile = iota
TypeDirectory
)
const TotalDiskSpace = 70000000
const UpgradeSpaceRequirement = 30000000
type File struct {
Name string
Path string
FileType int
Size int
Parent *File
Children []File
}
func getFileInDirectory(directory *File, fileName string) *File {
for c := range directory.Children {
if directory.Children[c].Name == fileName {
return &directory.Children[c]
}
}
return nil
}
func setDirectorySize(directory *File) {
directory.Size = 0
for c := range directory.Children {
if directory.Children[c].FileType == TypeDirectory {
setDirectorySize(&directory.Children[c])
}
directory.Size += directory.Children[c].Size
}
}
func flattenTree(file *File) map[string]File {
files := make(map[string]File)
// Add the current file
files[file.Path] = File{
Name: file.Name,
Path: file.Path,
FileType: file.FileType,
Size: file.Size,
}
// Add all the children, recursively
// Note: We may 'add' a file multiple times, however as the path is used
// as the map key, they will only appear once
for c := range file.Children {
childFiles := flattenTree(&file.Children[c])
for key, value := range childFiles {
files[childFiles[key].Path] = File{
Name: value.Name,
Path: value.Path,
FileType: value.FileType,
Size: value.Size,
}
}
}
return files
}
func main() {
regularFileRegex := regexp.MustCompile(`^\d+\s+[a-z\.]+$`)
// Define the root file as our filesystem must have one
root := &File{
Name: "",
Path: "/",
FileType: TypeDirectory,
Size: 0,
}
currentFile := root
scanner := bufio.NewScanner(os.Stdin)
// Read in each line and convert into tree
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if strings.Index(line, "$") == 0 {
// Command
command := strings.Fields(line)
if command[1] == "cd" {
// Change into a directory
if command[2] == ".." {
// Go up one level if we are not already at the top level
if currentFile.Parent != nil {
currentFile = currentFile.Parent
}
} else {
// Change into a directory - create it if it does not already exist
directoryName := command[2]
if strings.Index(directoryName, "/") == 0 {
// Absolute change of directory, find path
} else {
directory := getFileInDirectory(currentFile, directoryName)
if directory != nil {
currentFile = directory
} else {
filePath := currentFile.Path + "/" + directoryName
// If there are two leading forward slashes, remove one
if strings.Index(filePath, "//") == 0 {
filePath = filePath[1:]
}
newDirectory := &File{
Name: directoryName,
Path: filePath,
FileType: TypeDirectory,
Size: 0,
Parent: currentFile,
}
currentFile.Children = append(currentFile.Children, *newDirectory)
currentFile = newDirectory
}
}
}
} else if command[1] == "ls" {
// Listing directory - we can skip this as we will process the contents
// when we read the next lines
} else {
fmt.Println("Unexpected command: " + command[1])
os.Exit(1)
}
} else {
isDirectory := strings.Index(line, "dir") == 0
isRegularFile := regularFileRegex.MatchString(line)
if isDirectory || isRegularFile {
fileParts := strings.Fields(line)
// Assume we are working with a directory, then override
// if we have a regular file
fileSize := 0
fileName := fileParts[1]
fileType := TypeDirectory
if isRegularFile {
fileSize, _ = strconv.Atoi(fileParts[0])
fileType = TypeRegularFile
}
// Check if file exists
// This is necessary because we may have seen this file already,
// for example if we have changed into a directory and run ls in its parent
existingFile := getFileInDirectory(currentFile, fileName)
if existingFile == nil {
// Add file to this level of the tree
filePath := currentFile.Path + "/" + fileName
// If there are two leading forward slashes, remove one
if strings.Index(filePath, "//") == 0 {
filePath = filePath[1:]
}
newFile := File{
Name: fileName,
Path: filePath,
FileType: fileType,
Size: fileSize,
Parent: currentFile,
}
currentFile.Children = append(currentFile.Children, newFile)
}
}
}
}
// Recursively walk the tree and populate all directory sizes
// Go all the way up to the top
for ; currentFile.Parent != nil; currentFile = currentFile.Parent {
// Intentionally empty body
}
// Set the directory sizes from the root down
setDirectorySize(currentFile)
currentFreeSpace := TotalDiskSpace - currentFile.Size
extraSpaceRequired := UpgradeSpaceRequirement - currentFreeSpace
// Flatten the tree into a map of path -> file so we can process it iteratively
flattenedTree := flattenTree(currentFile)
// Deleting the root directory will be definition free up enough space
deleteDirectory := flattenedTree["/"]
for _, file := range flattenedTree {
if file.FileType == TypeDirectory && file.Size >= extraSpaceRequired && file.Size < deleteDirectory.Size {
deleteDirectory = file
}
}
fmt.Println(deleteDirectory.Size)
}
Post-solution thoughts
It has been pointed out that there are quick and dirty solutions which would involve less code - strictly there is no need to build a tree and a map would probably suffice. However, designing, populating and processing a tree was a good thought exercise, and part of the fun of Advent of Code is learning new things (or re-learning topics from my undergraduate degree).