Chunkers and Splitters
The content package provides several methods to help you chunk and split text:
ChunkText
takes a text string and divides it into chunks of a specified size with a given overlap. It returns a slice of strings, where each string represents a chunk of the original text.
SplitTextWithDelimiter
splits the given text using the specified delimiter and returns a slice of strings.
SplitTextWithRegex
splits the given text using the provided regular expression delimiter. It returns a slice of strings containing the split parts of the text.
SplitMarkdownBySections
splits the given Markdown text at its title sections (#, ##, etc.) and returns a slice of strings.
SplitAsciiDocBySections
splits the given AsciiDoc text at its title sections (=, ==, etc.) and returns a slice of strings.
SplitHTMLBySections
splits the given HTML text at its title sections (h1, h2, h3, h4, h5, h6) and returns a slice of strings.
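To make the two main ideas above concrete (fixed-size chunks with overlap, and splitting at section titles), here is a minimal, standalone Go sketch. It does not call the content package, and it is not that package's implementation: chunkWithOverlap and splitAtHeadings are hypothetical helpers written for illustration, and their parameters, rune-based sizing, and edge-case behavior are assumptions that may differ from ChunkText and SplitMarkdownBySections.

```go
package main

import (
	"fmt"
	"regexp"
)

// chunkWithOverlap is a hypothetical helper (not the content package's ChunkText).
// It cuts text into chunks of at most size runes, where each chunk starts
// overlap runes before the end of the previous one.
func chunkWithOverlap(text string, size, overlap int) []string {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil // assumption: reject parameters that would not advance
	}
	runes := []rune(text) // work on runes so multi-byte characters stay intact
	step := size - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the last chunk already reaches the end of the text
		}
	}
	return chunks
}

// headingRe matches the start of an ATX-style Markdown heading line (# to ######).
// Simplification: heading-like lines inside fenced code blocks are also matched.
var headingRe = regexp.MustCompile(`(?m)^#{1,6}\s`)

// splitAtHeadings is a hypothetical helper (not SplitMarkdownBySections).
// Each returned section starts at a heading; text before the first heading,
// if any, becomes its own leading section.
func splitAtHeadings(md string) []string {
	if md == "" {
		return nil
	}
	locs := headingRe.FindAllStringIndex(md, -1)
	if len(locs) == 0 {
		return []string{md}
	}
	var sections []string
	if locs[0][0] > 0 {
		sections = append(sections, md[:locs[0][0]])
	}
	for i, loc := range locs {
		end := len(md)
		if i+1 < len(locs) {
			end = locs[i+1][0]
		}
		sections = append(sections, md[loc[0]:end])
	}
	return sections
}

func main() {
	fmt.Printf("%q\n", chunkWithOverlap("abcdefghij", 4, 1))
	// prints ["abcd" "defg" "ghij"]

	for _, s := range splitAtHeadings("intro\n# A\ntext\n## B\nmore\n") {
		fmt.Printf("%q\n", s)
	}
}
```

With a size of 4 and an overlap of 1, each chunk repeats the last character of the previous chunk, which is how overlap preserves context across chunk boundaries. The same heading-based approach would apply to AsciiDoc by matching lines that start with = instead of #.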