Returns true when this loader is capable of handling source.
For string sources the check is purely extension-based. For Buffer
sources the loader may inspect magic bytes when relevant.
Absolute file path or raw bytes.
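The two branches described above can be sketched as follows. This is an illustrative sketch only, not this library's implementation: the function `canHandleSketch`, the `.pdf` extension list, and the choice of the PDF signature (`%PDF-`) as the magic-byte check are all assumptions made for the example.

```typescript
import { Buffer } from "node:buffer";
import { extname } from "node:path";

// Hypothetical extension list for an imagined PDF loader (leading dot included).
const SKETCH_EXTENSIONS: readonly string[] = [".pdf"];

// PDF files begin with the ASCII bytes "%PDF-"; used here as the magic-byte check.
const PDF_MAGIC = Buffer.from("%PDF-", "latin1");

// Hypothetical stand-in for a loader's capability check.
function canHandleSketch(source: string | Buffer): boolean {
  if (typeof source === "string") {
    // String sources: decide from the file extension alone, no disk access.
    return SKETCH_EXTENSIONS.includes(extname(source).toLowerCase());
  }
  // Buffer sources: compare the leading bytes against the known signature.
  return source.subarray(0, PDF_MAGIC.length).equals(PDF_MAGIC);
}
```

Because the string branch never touches the file system, a path with a matching extension is reported as handleable even if the file does not exist; that failure would only surface when the document is actually loaded.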
Parses source and returns a normalised LoadedDocument.
When source is a string the loader treats it as an absolute (or
resolvable) file path and reads the file from disk. When source is a
Buffer the loader parses the bytes directly and derives as much
metadata as possible from the buffer content alone.
Absolute file path OR raw document bytes.
Optional _options: LoadOptions
Optional hints such as a format override.
A promise resolving to the fully-populated LoadedDocument.
When the file cannot be read or the format is not parsable.
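A minimal sketch of the path-versus-Buffer behaviour described above, for a plain-text case. The shapes `SketchLoadOptions` and `SketchLoadedDocument` and the function `loadSketch` are assumptions made for illustration; the real `LoadOptions` and `LoadedDocument` types may differ.

```typescript
import { Buffer } from "node:buffer";
import { readFile } from "node:fs/promises";

// Assumed shapes, for illustration only.
interface SketchLoadOptions {
  format?: string; // hypothetical format override hint
}
interface SketchLoadedDocument {
  text: string;
  metadata: { source?: string; wordCount: number };
}

// Hypothetical stand-in for a loader's load method.
async function loadSketch(
  source: string | Buffer,
  _options?: SketchLoadOptions, // accepted but unused in this sketch
): Promise<SketchLoadedDocument> {
  let bytes: Buffer;
  const metadata: SketchLoadedDocument["metadata"] = { wordCount: 0 };

  if (typeof source === "string") {
    // String source: treat it as a file path and read the file from disk.
    try {
      bytes = await readFile(source);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      throw new Error(`Cannot read ${source}: ${reason}`);
    }
    metadata.source = source;
  } else {
    // Buffer source: parse the bytes directly; no path is available.
    bytes = source;
  }

  const text = bytes.toString("utf8");
  metadata.wordCount = text.split(/\s+/).filter(Boolean).length;
  return { text, metadata };
}
```

In this sketch a Buffer source yields no `source` metadata, since nothing in the bytes identifies where they came from.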
Readonly supported
File extensions this loader handles, each with a leading dot.
Used by LoaderRegistry to route file paths to the correct loader.
['.md', '.mdx']
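The routing role of the extension list can be illustrated with a small sketch. The class `SketchRegistry`, the interface `RoutableLoader`, and its `extensions` field are hypothetical names chosen for the example; the actual `LoaderRegistry` API is not shown in this reference and may work differently.

```typescript
import { extname } from "node:path";

// Hypothetical minimal loader shape: a name plus its leading-dot extensions.
interface RoutableLoader {
  readonly name: string;
  readonly extensions: readonly string[];
}

// Hypothetical registry that maps each extension to the loader claiming it.
class SketchRegistry {
  private readonly byExtension = new Map<string, RoutableLoader>();

  register(loader: RoutableLoader): void {
    for (const ext of loader.extensions) {
      // Keys are stored lower-cased so lookups are case-insensitive.
      this.byExtension.set(ext.toLowerCase(), loader);
    }
  }

  // Returns the loader registered for the path's extension, if any.
  resolve(filePath: string): RoutableLoader | undefined {
    return this.byExtension.get(extname(filePath).toLowerCase());
  }
}

const registry = new SketchRegistry();
registry.register({ name: "markdown", extensions: [".md", ".mdx"] });
registry.register({ name: "html", extensions: [".html", ".htm"] });
```

Keeping the leading dot in each entry lets the registry compare against the output of `path.extname` directly, which also includes the dot.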
Basic document loader for HTML (.html, .htm) files.
Text extraction strategy
<script> and <style> blocks are removed entirely.
Block-level tags (<p>, <div>, <h1>–<h6>, etc.) are replaced with newline characters to preserve paragraph structure.
Metadata
title: extracted from the <title> element when present.
wordCount: approximate count of words in the extracted text.
source: absolute file path (when loaded from disk).
Implements
Example
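A minimal sketch of the extraction strategy described above, written as a standalone function rather than as the loader itself. The name `extractHtmlSketch`, the result shape, the regular expressions, and the decision to drop all remaining tags are assumptions made for illustration. HTML entities are not decoded, and regex-based stripping is only adequate for simple, well-formed markup.

```typescript
// Assumed result shape, for illustration only.
interface HtmlExtractionSketch {
  text: string;
  title?: string;
  wordCount: number;
}

// Hypothetical helper mirroring the documented steps on an HTML string.
function extractHtmlSketch(html: string): HtmlExtractionSketch {
  // title: taken from the <title> element when present.
  const title = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html)?.[1]?.trim();

  const text = html
    // 1. Remove <script> and <style> blocks together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    // 2. Replace block-level tags (only a subset is listed here) with newlines.
    .replace(/<\/?(?:p|div|h[1-6])\b[^>]*>/gi, "\n")
    // 3. Drop any remaining tags (an assumption of this sketch).
    .replace(/<[^>]+>/g, "")
    // 4. Normalise whitespace: collapse spaces, then collapse blank lines.
    .replace(/[ \t]+/g, " ")
    .replace(/\s*\n\s*/g, "\n")
    .trim();

  // wordCount: approximate, based on whitespace-separated tokens.
  const wordCount = text.split(/\s+/).filter(Boolean).length;

  return title === undefined ? { text, wordCount } : { text, title, wordCount };
}
```

Note that this sketch leaves the contents of `<title>` in the extracted text as well as in the metadata; whether the real loader does the same is not stated in this reference.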