在Haskell中从文件中读取树

Reading a tree from file in Haskell

我想从haskell中的一个文件创建一个树。为此,我将文件读取到列表中:

列表中每个元素的名称都遵循此模式:

1
2
3
4
5
6
7
["Name","Dad","Mum"]

[["Bob","Dylan","Susan"],
 ["Dylan","Cole","Sarah"],
 ["Cole","Patrick","Patricia"],
 ["Sarah","David","Fiona"],
 ["Susan","Michael","Madeline"]]

所需输出如下:

1
2
3
4
5
6
7
8
9
10
11
Bob
      Dylan
            Cole
                  Patrick
                  Patricia
            Sarah
                  David
                  Fiona
      Susan
            Michael
            Madeline

空间可以是一个标签,我只是更强调我的观点。

以下是迄今为止我所做的:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)

singleton :: a -> Tree a  
singleton x = Node x EmptyTree EmptyTree  

treeInsert :: (Ord a) => a -> Tree a -> Tree a  
treeInsert x EmptyTree = singleton x  
treeInsert x (Node a left right)  
    | x == a = Node x left right  
    | x < a  = Node a (treeInsert x left) right  
    | x > a  = Node a left (treeInsert x right)  

createTree :: (Ord a) => [a] -> Tree a
createTree [] = EmptyTree
createTree (x:xs) = createTree2 (Node x EmptyTree EmptyTree) xs
  where
    createTree2 tree [] = tree
    createTree2 tree (y:ys) = createTree2 (treeInsert y tree) ys

printTree :: Show a => Tree a -> IO ()
printTree = (mapM_ putStrLn) . treeIndent
  where
    treeIndent EmptyTree          = ["
Empty Tree
"
]
    treeIndent (Node v lb rb) =
      [(show v)] ++
      map ("     " ++) ls ++
      ("" ++ r) : map ("  " ++) rs
    where
        (r:rs) = treeIndent $ rb
        ls     = treeIndent $ lb

所有这些让我,基本上创建了树,并打印到屏幕上。我正在努力的是根据这个问题正确地创建树。


如果我理解正确,那么这个问题有两个方面的问题:创建树,并以所需的样式打印它。我会一一说明:

创建树

这个问题的症结在于输入数据是以我要称之为关联列表的形式出现的,它将每个父节点与两个子节点关联起来。此列表对如何构建树进行了约束,但如何根据这些约束进行操作可能不是很明显(值得注意的是,它们没有指定唯一的树)。我编写这个函数是为了完成它,使用您的Tree类型:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
data Tree a = EmptyTree | Node a (Tree a) (Tree a)

toTree :: [[String]] -> Tree String
toTree list = toTree' root
  where
    -- both these are extremely unsafe, as they assume that the input is a list
    -- of lists each with length three
    root = fst $ head mapping
    mapping :: [(String, (String, String))]
    mapping = fmap (\(p:c1:c2:[]) -> (p, (c1, c2))) list

    -- Recursively build our tree, using the association list defined above to
    -- look up the children for each node. If there are no children, we return
    -- a node with EmptyTree children instead.
    toTree' root = let childs = lookup root mapping
                   in  maybe (Node root EmptyTree EmptyTree)
                             (\(l, r) -> Node root (toTree' l) (toTree' r))
                             childs

此函数将您的列表输入转换为一个名为mapping[(String, (String, String))]。使用Lists的lookup函数,我们可以使用mapping作为关联列表,搜索与父String关联的子(String, String)

然后,我们使用toTree'函数递归地构建树。在每个节点上,它为该节点的子节点执行对mapping关联列表的查找。如果有子级,它会递归地将它们添加到树中。以这种方式构建树意味着输入元组可以是任意顺序的。这里对Listlookup功能的使用效率很低,如果考虑到性能,可以使用Data.Map来代替。

打印树

您的方法使用递归,这可能是实现这一点的最简单方法,但是您仍然尝试收集所有输出的列表,然后在最后对其执行mapM。我认为,在遍历树时简单地输出节点内容比较容易,除非有一些原因不这样做(如果想收集输出,可以使用Writermonad而不是IO)。

我的方法使用一个Int计数器来跟踪缩进量:

1
2
3
4
5
6
7
8
9
10
11
12
13
printTree :: Tree String -> IO ()
printTree t = printTree' t 0
  where
    -- if we reached the bottom of the tree, do nothing
    printTree' EmptyTree _ = return ()

    -- We first print the current node's string value, and then recursively
    -- call ourselves for the children. This is a simple depth-first tree
    -- traversal, for which binary trees are well-suited.
    printTree' (Node s l r) depth = do
      putStrLn $ replicate depth ' ' ++ s
      printTree' l (depth + 2)
      printTree' r (depth + 2)

输出格式良好:

1
2
3
4
5
6
7
8
9
10
11
Bob
  Dylan
    Cole
      Patrick
      Patricia
    Sarah
      David
      Fiona
  Susan
    Michael
    Madeline

另一种选择

我怀疑这是一个家庭作业问题或类似的问题,这可能使得使用二叉树是不可协商的,但是在这里很容易进行深度优先遍历相邻列表,而不必将其转换为二叉树(算法看起来很相似):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
simpleTreePrint :: [[String]] -> IO ()
simpleTreePrint list = p' (fst $ head mapping) 0
  where
    -- this recursive function prints the 'root' name (eg"Bob") that it is
    -- called with, then recursively calls itself for all the children of
    -- that name that it finds in the 'mapping' data structure
    p' :: String -> Int -> IO ()
    p' root depth = let children = maybe [] id $ lookup root mapping
                    in  do putStrLn $ replicate depth ' ' ++ root
                           forM_ children $ \c -> p' c (depth + 2)

    -- to make child lookups easier, we convert the original list of lists
    -- of names into tuples whose first values are the 'parent' name, and
    -- whose second values are the remaining names. This allows us to use the
    -- regular List lookup function, which is not efficient but may suffice
    -- for this application
    mapping :: [(String, [String])]
    mapping = fmap (\(p:cs) -> (p, cs)) list

这种方法将您的输入数据集更像是一个广义图。它可以处理有两个以上孩子的家长,并且有了更复杂的输入数据集,我们可以利用图形方法来做更酷的事情。不过,我们的simpleTreePrint函数可能会中断,因为它只在输入数据严格为树时才真正起作用。


考虑问题的更一般版本可能会更简单。也就是说,考虑一个[(a, Maybe a, Maybe a)]类型的列表。通过将每个第一个元素作为节点,您可以从此列表中构建树(更具体地说,是树列表),第二和第三个元素对应于分支-如果它们是Nothing,则分支是Nil。否则,该函数的语义与您要编写的语义完全对应。

首先,编写一个用于编码此逻辑的助手函数:

1
2
3
4
lookupDef :: Eq a => Maybe a -> [(a, Tree a)] -> Tree a
lookupDef Nothing   _ = Nil
lookupDef (Just a) xs | Just r <- lookup a xs = r
                      | otherwise             = Node a Nil Nil

第二个参数是与其余名称的树对应的(键、值)对的现有列表。然后,如果要查找的值不是空的,而是键中的一个,那么它是一个"终端"值,因此只需在树中单独返回它。

然后,构造上述(键、值)对列表的中间函数。

1
2
3
4
readTreeList :: Eq a => [(a, Maybe a, Maybe a)] -> [(a, Tree a)]
readTreeList [] = []
readTreeList xs@(_:_) = result where
 result = [ (p, Node p ? ?) | (p, l, r) <- xs ]

上面应该很明显:输入列表中的每个键都必须对应于输出中的一个值。该键的树将是一个Node p q r,其中q/r是对应于l/r的树。上面的第一个函数将计算q和r。这是lookupDef函数的来源:

1
 result = [ (p, Node p (lookupDef l ?) (lookupDef r ?)) | (p, l, r) <- xs ]

但是要在其中查找子树的列表是什么?我们只有一个这样的列表是result,所以让我们尝试一下:

1
2
 result = [ (p, Node p (lookupDef l result) (lookupDef r result))
          | (p, l, r) <- xs ]

由于懒惰的魔力,这实际上是可行的。

然后,要从中获得一个树,只需获取结果的第一个元素(您的示例输入指示应将第一个元素用作根元素)。实际上,您可以将此部分与上面的内容内联:

1
2
3
4
5
readTree :: Eq a => [(a, Maybe a, Maybe a)] -> Tree a
readTree [] = Nil
readTree xs@(_:_) = snd h where
  result@(h:_) = [ (p, Node p (lookupDef l result) (lookupDef r result))
                 | (p, l, r) <- xs ]

然后你的数据:

1
2
3
4
5
6
test = map (\([x,y,z]) -> (x, Just y, Just z))
  [["Dylan","Cole","Sarah"],
   ["Sarah","David","Fiona"],
   ["Bob","Dylan","Susan"],
   ["Cole","Patrick","Patricia"],
   ["Susan","Michael","Madeline"]]

结果是:

1
2
3
4
5
6
7
8
9
10
11
12
>printTree $ readTree test
"Bob"
  |"Dylan"
  |  |"Cole"
  |  |  |"Patrick"
  |  |  |"Patricia"
  |  |"Sarah"
  |  |  |"David"
  |  |  |"Fiona"
  |"Susan"
  |  |"Michael"
  |  |"Madeline"

如果数据类型不是存储键值对(data.map)的列表,这当然会更快,但这是另一个问题。

请注意,我稍微修改了/添加了定义,但这与上述代码无关:

1
2
3
4
5
6
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}

import qualified Data.Foldable as F

data Tree a = Nil | Node a (Tree a) (Tree a)
  deriving (Show, Read, Eq, Functor, F.Foldable)

这给了你fmaptoList

1
2
3
4
5
6
formatTree Nil = Nil
formatTree (Node a l r) = Node (show a)
                               (fmap ("  |" ++) $ formatTree l)
                               (fmap ("  |" ++) $ formatTree r)

printTree x = putStrLn . unlines . F.toList . formatTree $ x

这为您提供了一个更简单的漂亮打印功能。


让我们为子-父关联列表提供一个类型:

1
2
3
4
type Parents = [ [String] ]

theParents :: Parents
theParents = [ ["Bob","Dylan","Susan"], ["Dylan","Cole","Sarah"], ... ]

您首先必须编写一个函数来查找此列表中的数据:

1
2
lookupParents :: Parents -> String -> (Maybe String, Maybe String)
lookupParents pars name = ...???...

例如。:

1
2
lookupParents theParents"Bob" = (Just"Dylan", Just"Susan")
lookupParents theParents"nobody" = (Nothing, Nothing)

接下来,您的buildTree函数将如下所示:

1
2
3
4
5
buildTree :: Parents -> String -> Tree String
buildTree parents rootName = Node rootName leftTree rightTree
  where (mleft, mright) = lookupParents parents rootName
        leftTree = ... some function of mleft ...
        rightTree = ... some function of mright ...