Finding whether a string is a substring of another string in SML without library functions

Question

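I am trying to write a function subString : string * string -> int that checks whether the first string is a substring of the second. The check is case sensitive.

If the first string is a substring, I want to return the index at which it starts, counting from 0; if it is not, return -1. If it occurs more than once, return only the index of the first occurrence.

For example: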

subString("bc","abcabc") ===>1
subString("aaa","aaaa") ===>0
subString("bc","ABC") ===>-1

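Since I'm not very familiar with SML or with using strings in SML, and I'm not supposed to use any built-in functions (such as String.sub), I'm having a hard time getting my head around this problem.

I am allowed to use helper functions.

All I can think of is to somehow use explode in a helper function, check the lists somehow, and then implode them, but how do I get the index position?

All I have is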

fun subString(s1,s2) =
     if null s2 then ~1
     else if s1 = s2 then 0
     else 1+subString(s1, tl s2);

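I was thinking of using a helper function that explodes the strings and then perhaps compares the two, but I can't figure out how to make that work.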

Answer 1

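I'm not supposed to use any built-in functions like String.sub

Too bad! Since strings have an abstract interface, whereas lists give you direct access to their primary constructors [] and ::, you have to use library functions to get anywhere with strings. explode is also a library function. But okay, if your constraint is that you must convert your strings into lists to solve the exercise, so be it.

Given your current code,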

fun subString(s1,s2) =
     if null s2 then ~1
     else if s1 = s2 then 0
     else 1+subString(s1, tl s2);

I sense one problem here:

   subString ([#"b",#"c"], [#"a",#"b",#"c",#"d"])
~> if null ([#"a",#"b",#"c",#"d"]) then ... else
   if [#"b",#"c"] = [#"a",#"b",#"c",#"d"] then ... else
   1 + subString([#"b",#"c"], [#"b",#"c",#"d"])

~> 1 + subString([#"b",#"c"], [#"b",#"c",#"d"])
~> 1 + if null ([#"b",#"c",#"d"]) then ... else
       if [#"b",#"c"] = [#"b",#"c",#"d"] then ... else
       1 + subString([#"b",#"c"], [#"c",#"d"])

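It seems that checking s1 = s2 is not quite enough: we would like to say that [#"b",#"c"] is a substring of [#"b",#"c",#"d"] because it is a prefix of it, not because it is equal to it. With s1 = s2 you end up checking whether something is a valid suffix, not a valid substring. So you need to change s1 = s2 into something smarter.

Maybe you can build a helper function that determines whether one list is a prefix of another, and use it here?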
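To make the difference between equality and the prefix test concrete, here is a minimal sketch of such a helper. The name startsWith is hypothetical (it is not from the question), and the sketch assumes both arguments are already char lists produced by explode.

```sml
(* startsWith is a hypothetical name: true when the first list
 * is a prefix of the second list. *)
fun startsWith ([], _) = true
  | startsWith (_, []) = false
  | startsWith (x::xs, y::ys) = x = y andalso startsWith (xs, ys)

(* Equality rejects the pair from the trace above, the prefix test accepts it. *)
val byEquality = explode "bc" = explode "bcd"
val byPrefix   = startsWith (explode "bc", explode "bcd")
```

Here byEquality is false while byPrefix is true, which is the behaviour the recursive search needs at each position.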

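As for solving this exercise by exploding your strings into lists: this is so inefficient that Standard ML's sister language OCaml had explode removed from its library entirely:

The functions explode and implode were in older versions of Caml, but we omitted them from OCaml because they encourage inefficient code. It is generally a bad idea to treat a string as a list of characters, and seeing it as an array of characters is a much better fit to the actual implementation.

First off, String.isSubstring already exists, so this is a solved problem. But if it didn't, and one wanted to write it, and String.sub were not cheating (it accesses a character in a string, which is comparable to pattern matching the head and tail of a list via x::xs), then let me encourage you to write efficient, composable and functional code: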

(* Check that a predicate holds for all (c, i) of s, where
 * s is a string, c is every character in that string, and
 * i is the position of c in s. *)
fun alli s p =
    let val stop = String.size s
        fun go i = i = stop orelse p (String.sub (s, i), i) andalso go (i + 1)
    in go 0 end

(* needle is a prefix of haystack from the start'th index *)
fun isPrefixFrom (needle, haystack, start) =
    String.size needle + start <= String.size haystack andalso
    alli needle (fn (c, i) => String.sub (haystack, i + start) = c)

(* needle is a prefix of haystack if it is from the 0th index *)
fun isPrefix (needle, haystack) =
    isPrefixFrom (needle, haystack, 0)

(* needle is a substring of haystack if it is a prefix from any index *)
fun isSubstring (needle, haystack) =
    let fun go i =
            String.size needle + i <= String.size haystack andalso
            (isPrefixFrom (needle, haystack, i) orelse go (i + 1))
    in go 0 end

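The general idea, which you can reuse when building an isSubstring that uses list recursion rather than string index recursion, is to build the algorithm abstractly: needle being a substring of haystack can be defined in simpler terms as needle being a prefix of haystack counting from some valid position in haystack (not past the end of haystack, of course). And determining whether something is a prefix is much easier, even more so with list recursion!

This suggestion would leave you with a template,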

fun isPrefix ([], _) = ...
  | isPrefix (_, []) = ...
  | isPrefix (x::xs, y::ys) = ...

fun isSubstring ([], _) = ...
  | isSubstring (xs, ys) = ... isPrefix ... orelse ...
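One possible way to fill in that template is sketched below. It is only an illustration of the idea, not the answerer's own solution, and it assumes both arguments are char lists obtained with explode.

```sml
(* needle is a prefix of haystack; both are lists *)
fun isPrefix ([], _) = true
  | isPrefix (_, []) = false
  | isPrefix (x::xs, y::ys) = x = y andalso isPrefix (xs, ys)

(* needle is a substring of haystack if it is a prefix of haystack
 * itself, or a substring of haystack's tail *)
fun isSubstring ([], _) = true
  | isSubstring (xs, ys) =
      isPrefix (xs, ys) orelse
      (not (null ys) andalso isSubstring (xs, tl ys))
```

For example, isSubstring (explode "bc", explode "abcd") evaluates to true, and isSubstring (explode "bc", explode "") evaluates to false.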

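As for optimizing the string index recursion solution, you could avoid the double bounds checking in both isPrefixFrom and isSubstring by making isPrefixFrom a local function that is only accessible to isPrefix and isSubstring; otherwise it would be unsafe.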
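A rough sketch of that optimization, under the assumption that a local ... in ... end block is an acceptable way to hide the unchecked helper. Unlike the version above, this sketch inlines the character loop instead of going through alli, and the callers do the length check once.

```sml
local
    (* Unchecked: the caller must guarantee that
     * String.size needle + start <= String.size haystack. *)
    fun isPrefixFrom (needle, haystack, start) =
        let val stop = String.size needle
            fun go i = i = stop orelse
                       String.sub (needle, i) = String.sub (haystack, i + start)
                       andalso go (i + 1)
        in go 0 end
in
    (* the only bounds check for the prefix case *)
    fun isPrefix (needle, haystack) =
        String.size needle <= String.size haystack andalso
        isPrefixFrom (needle, haystack, 0)

    (* last is the final index at which needle can still fit *)
    fun isSubstring (needle, haystack) =
        let val last = String.size haystack - String.size needle
            fun go i = i <= last andalso
                       (isPrefixFrom (needle, haystack, i) orelse go (i + 1))
        in go 0 end
end
```

Because isPrefixFrom is not visible outside the local block, nothing can call it with a start position that runs past the end of haystack.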

Testing this,

- isSubstring ("bc", "bc");
> val it = true : bool
- isSubstring ("bc", "bcd");
> val it = true : bool
- isSubstring ("bc", "abc");
> val it = true : bool
- isSubstring ("bc", "abcd");
> val it = true : bool
- isSubstring ("bc", "");
> val it = false : bool

Answer 2

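This is already a really good start, but there are some slight problems:

In your recursive case you add 1 to the recursive result, even if the recursive application did not find the substring and returned -1. You should check whether the result is -1 before adding 1.

In the second line you check whether the two strings are equal. If you do that, you will only find a substring if the string ends with that substring. So what you really want to do in line 2 is test whether s2 starts with s1. I would recommend that you write a helper function that performs that test. For this helper function you could indeed use explode and then recursively check whether the first characters of the lists are identical. Once you have this helper function, use it in line 2 instead of the equality test.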
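Putting both fixes together could look roughly like the sketch below. The helper names startsWith and search are hypothetical, and the sketch assumes explode, null and tl are allowed even though String.sub is not.

```sml
(* startsWith is a hypothetical helper: true when the first
 * char list is a prefix of the second. *)
fun startsWith ([], _) = true
  | startsWith (_, []) = false
  | startsWith (x::xs, y::ys) = x = y andalso startsWith (xs, ys)

fun subString (s1, s2) =
    let
        val needle = explode s1
        (* search returns the 0-based index of the first match,
         * relative to the list it is given, or ~1 if there is none *)
        fun search haystack =
            if startsWith (needle, haystack) then 0
            else if null haystack then ~1
            else
                let val r = search (tl haystack)
                in if r = ~1 then ~1 else 1 + r end
    in
        search (explode s2)
    end
```

With this version, the three examples from the question give 1, 0 and ~1 respectively (SML writes negative one as ~1).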
