在Haskell中从文件中读取树

Reading a tree from file in Haskell

我想从haskell中的一个文件创建一个树。为此，我将文件读取到列表中：

列表中每个元素的名称都遵循此模式：

1
2
3
4
5
6
7

["Name","Dad","Mum"]

[["Bob","Dylan","Susan"],
["Dylan","Cole","Sarah"],
["Cole","Patrick","Patricia"],
["Sarah","David","Fiona"],
["Susan","Michael","Madeline"]]

所需输出如下：

1
2
3
4
5
6
7
8
9
10
11

Bob
Dylan
Cole
Patrick
Patricia
Sarah
David
Fiona
Susan
Michael
Madeline

号

空间可以是一个标签，我只是更强调我的观点。

以下是迄今为止我所做的：

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32

data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)

singleton :: a -> Tree a
singleton x = Node x EmptyTree EmptyTree

treeInsert :: (Ord a) => a -> Tree a -> Tree a
treeInsert x EmptyTree = singleton x
treeInsert x (Node a left right)
| x == a = Node x left right
| x < a = Node a (treeInsert x left) right
| x > a = Node a left (treeInsert x right)

createTree :: (Ord a) => [a] -> Tree a
createTree [] = EmptyTree
createTree (x:xs) = createTree2 (Node x EmptyTree EmptyTree) xs
where
createTree2 tree [] = tree
createTree2 tree (y:ys) = createTree2 (treeInsert y tree) ys

printTree :: Show a => Tree a -> IO ()
printTree = (mapM_ putStrLn) . treeIndent
where
treeIndent EmptyTree = ["
Empty Tree
"]
treeIndent (Node v lb rb) =
[(show v)] ++
map (" " ++) ls ++
("" ++ r) : map (" " ++) rs
where
(r:rs) = treeIndent $ rb
ls = treeIndent $ lb

所有这些让我，基本上创建了树，并打印到屏幕上。我正在努力的是根据这个问题正确地创建树。

相关讨论

如果我理解正确，那么这个问题有两个方面的问题：创建树，并以所需的样式打印它。我会一一说明：

创建树

这个问题的症结在于输入数据是以我要称之为关联列表的形式出现的，它将每个父节点与两个子节点关联起来。此列表对如何构建树进行了约束，但如何根据这些约束进行操作可能不是很明显(值得注意的是，它们没有指定唯一的树)。我编写这个函数是为了完成它，使用您的Tree类型：

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

data Tree a = EmptyTree | Node a (Tree a) (Tree a)

toTree :: [[String]] -> Tree String
toTree list = toTree' root
where
-- both these are extremely unsafe, as they assume that the input is a list
-- of lists each with length three
root = fst $ head mapping
mapping :: [(String, (String, String))]
mapping = fmap (\(p:c1:c2:[]) -> (p, (c1, c2))) list

-- Recursively build our tree, using the association list defined above to
-- look up the children for each node. If there are no children, we return
-- a node with EmptyTree children instead.
toTree' root = let childs = lookup root mapping
in maybe (Node root EmptyTree EmptyTree)
(\(l, r) -> Node root (toTree' l) (toTree' r))
childs

此函数将您的列表输入转换为一个名为mapping的[(String, (String, String))]。使用Lists的lookup函数，我们可以使用mapping作为关联列表，搜索与父String关联的子(String, String)。

然后，我们使用toTree'函数递归地构建树。在每个节点上，它为该节点的子节点执行对mapping关联列表的查找。如果有子级，它会递归地将它们添加到树中。以这种方式构建树意味着输入元组可以是任意顺序的。这里对List的lookup功能的使用效率很低，如果考虑到性能，可以使用Data.Map来代替。

打印树

您的方法使用递归，这可能是实现这一点的最简单方法，但是您仍然尝试收集所有输出的列表，然后在最后对其执行mapM。我认为，在遍历树时简单地输出节点内容比较容易，除非有一些原因不这样做(如果想收集输出，可以使用Writermonad而不是IO)。

我的方法使用一个Int计数器来跟踪缩进量：

1
2
3
4
5
6
7
8
9
10
11
12
13

printTree :: Tree String -> IO ()
printTree t = printTree' t 0
where
-- if we reached the bottom of the tree, do nothing
printTree' EmptyTree _ = return ()

-- We first print the current node's string value, and then recursively
-- call ourselves for the children. This is a simple depth-first tree
-- traversal, for which binary trees are well-suited.
printTree' (Node s l r) depth = do
putStrLn $ replicate depth ' ' ++ s
printTree' l (depth + 2)
printTree' r (depth + 2)

号

输出格式良好：

1
2
3
4
5
6
7
8
9
10
11

Bob
Dylan
Cole
Patrick
Patricia
Sarah
David
Fiona
Susan
Michael
Madeline

另一种选择

我怀疑这是一个家庭作业问题或类似的问题，这可能使得使用二叉树是不可协商的，但是在这里很容易进行深度优先遍历相邻列表，而不必将其转换为二叉树(算法看起来很相似)：

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

simpleTreePrint :: [[String]] -> IO ()
simpleTreePrint list = p' (fst $ head mapping) 0
where
-- this recursive function prints the 'root' name (eg"Bob") that it is
-- called with, then recursively calls itself for all the children of
-- that name that it finds in the 'mapping' data structure
p' :: String -> Int -> IO ()
p' root depth = let children = maybe [] id $ lookup root mapping
in do putStrLn $ replicate depth ' ' ++ root
forM_ children $ \c -> p' c (depth + 2)

-- to make child lookups easier, we convert the original list of lists
-- of names into tuples whose first values are the 'parent' name, and
-- whose second values are the remaining names. This allows us to use the
-- regular List lookup function, which is not efficient but may suffice
-- for this application
mapping :: [(String, [String])]
mapping = fmap (\(p:cs) -> (p, cs)) list

。

这种方法将您的输入数据集更像是一个广义图。它可以处理有两个以上孩子的家长，并且有了更复杂的输入数据集，我们可以利用图形方法来做更酷的事情。不过，我们的simpleTreePrint函数可能会中断，因为它只在输入数据严格为树时才真正起作用。

考虑问题的更一般版本可能会更简单。也就是说，考虑一个[(a, Maybe a, Maybe a)]类型的列表。通过将每个第一个元素作为节点，您可以从此列表中构建树(更具体地说，是树列表)，第二和第三个元素对应于分支-如果它们是Nothing，则分支是Nil。否则，该函数的语义与您要编写的语义完全对应。

首先，编写一个用于编码此逻辑的助手函数：

1
2
3
4

lookupDef :: Eq a => Maybe a -> [(a, Tree a)] -> Tree a
lookupDef Nothing _ = Nil
lookupDef (Just a) xs | Just r <- lookup a xs = r
| otherwise = Node a Nil Nil

第二个参数是与其余名称的树对应的(键、值)对的现有列表。然后，如果要查找的值不是空的，而是键中的一个，那么它是一个"终端"值，因此只需在树中单独返回它。

然后，构造上述(键、值)对列表的中间函数。

1
2
3
4

readTreeList :: Eq a => [(a, Maybe a, Maybe a)] -> [(a, Tree a)]
readTreeList [] = []
readTreeList xs@(_:_) = result where
result = [ (p, Node p ? ?) | (p, l, r) <- xs ]

。

上面应该很明显：输入列表中的每个键都必须对应于输出中的一个值。该键的树将是一个Node p q r，其中q/r是对应于l/r的树。上面的第一个函数将计算q和r。这是lookupDef函数的来源：

1	result = [ (p, Node p (lookupDef l ?) (lookupDef r ?)) \| (p, l, r) <- xs ]

。

但是要在其中查找子树的列表是什么？我们只有一个这样的列表是result，所以让我们尝试一下：

1 2	result = [ (p, Node p (lookupDef l result) (lookupDef r result)) \| (p, l, r) <- xs ]

由于懒惰的魔力，这实际上是可行的。

然后，要从中获得一个树，只需获取结果的第一个元素(您的示例输入指示应将第一个元素用作根元素)。实际上，您可以将此部分与上面的内容内联：

1
2
3
4
5

readTree :: Eq a => [(a, Maybe a, Maybe a)] -> Tree a
readTree [] = Nil
readTree xs@(_:_) = snd h where
result@(h:_) = [ (p, Node p (lookupDef l result) (lookupDef r result))
| (p, l, r) <- xs ]

。

然后你的数据：

1
2
3
4
5
6

test = map (\([x,y,z]) -> (x, Just y, Just z))
[["Dylan","Cole","Sarah"],
["Sarah","David","Fiona"],
["Bob","Dylan","Susan"],
["Cole","Patrick","Patricia"],
["Susan","Michael","Madeline"]]

结果是：

1
2
3
4
5
6
7
8
9
10
11
12

。

如果数据类型不是存储键值对(data.map)的列表，这当然会更快，但这是另一个问题。

请注意，我稍微修改了/添加了定义，但这与上述代码无关：

1
2
3
4
5
6

{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}

import qualified Data.Foldable as F

data Tree a = Nil | Node a (Tree a) (Tree a)
deriving (Show, Read, Eq, Functor, F.Foldable)

号

这给了你fmap和toList。

1
2
3
4
5
6

formatTree Nil = Nil
formatTree (Node a l r) = Node (show a)
(fmap (" |" ++) $ formatTree l)
(fmap (" |" ++) $ formatTree r)

printTree x = putStrLn . unlines . F.toList . formatTree $ x

号

这为您提供了一个更简单的漂亮打印功能。