The following table shows the median time it takes to decode a string into a tree for HTML parsers that can be used from Elixir. The benchmarks were run on a machine with an `AMD Ryzen 9 3950X (32) @ 3.500GHz` CPU and 32GB of RAM. To run the benchmark yourself, use the `mix fast_html.bench` task.
Full benchmark output can be seen in [this snippet](https://git.pleroma.social/pleroma/elixir-libraries/fast_html/snippets/3128).
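
For readers who want a rough idea of what a "median decode time" measurement involves, here is a minimal sketch. It is not the code behind `mix fast_html.bench`: the module `MedianTimerSketch` and its functions are hypothetical names, and the timed workload is a stand-in (`String.split/2`) rather than a real parser call.

```elixir
defmodule MedianTimerSketch do
  # Hypothetical helper, not part of fast_html or its benchmark task.
  # Calls `fun` `runs` times and returns the median wall-clock time
  # in microseconds.
  def median_us(fun, runs \\ 101) when is_function(fun, 0) and runs > 0 do
    times =
      for _ <- 1..runs do
        {micros, _result} = :timer.tc(fun)
        micros
      end

    median(times)
  end

  # Median of a non-empty list of numbers; averages the two middle
  # values when the list has an even length.
  def median(values) do
    sorted = Enum.sort(values)
    count = length(sorted)
    mid = div(count, 2)

    if rem(count, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end
end

# Stand-in workload; with a parser installed, its decode call would be
# wrapped in the anonymous function instead.
html = String.duplicate("<p>hello</p>", 1_000)
median = MedianTimerSketch.median_us(fn -> String.split(html, "<") end)
IO.puts("median: #{median} microseconds")
```

The median is less sensitive than the mean to occasional slow runs (GC pauses, scheduler noise), which is presumably why it is the figure reported.
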
1. Myhtmlex has a C-Node mode, but it wasn't benchmarked here because it segfaults on `document-large.html`.
2. The slowdown on `fragment-small.html` is due to Port overhead. Unlike html5ever and Myhtmlex in NIF mode, `fast_html` runs the parser in an isolated OS process and communicates with it over stdio, so even a fatal crash in the parser won't bring down the entire VM.
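
The isolation described in note 2 can be illustrated with a plain Elixir port. This is a minimal sketch under stated assumptions, not `fast_html`'s actual worker code: the module `PortIsolationSketch` is a hypothetical name, `sh` stands in for the parser process, and a Unix-like system is assumed.

```elixir
defmodule PortIsolationSketch do
  # Hypothetical illustration only; this is not fast_html's worker code.
  # Runs `script` under `sh` as an external OS process connected over
  # stdio, sends it `input`, and reports how the process ended.
  def run(script, input) do
    sh = System.find_executable("sh") || raise "sh not found (Unix-like system assumed)"

    port =
      Port.open({:spawn_executable, sh}, [:binary, :exit_status, args: ["-c", script]])

    Port.command(port, input)
    collect(port, "")
  end

  # Accumulates stdout chunks until the external process exits.
  defp collect(port, acc) do
    receive do
      {^port, {:data, chunk}} -> collect(port, acc <> chunk)
      {^port, {:exit_status, 0}} -> {:ok, acc}
      {^port, {:exit_status, status}} -> {:crashed, status, acc}
    after
      5_000 ->
        Port.close(port)
        {:timeout, acc}
    end
  end
end

# A well-behaved worker: reads one line and echoes it back on stdout.
ok = PortIsolationSketch.run(~S(read -r line; printf '%s\n' "$line"), "<p>hi</p>\n")
IO.inspect(ok, label: "healthy worker")

# A worker that dies from SIGSEGV after reading its input. The caller
# only sees a non-zero exit status; the VM itself keeps running.
crashed = PortIsolationSketch.run("read -r line; kill -s SEGV $$", "<p>hi</p>\n")
IO.inspect(crashed, label: "crashed worker")
```

Every request and response crosses a pipe between two OS processes, which is the Port overhead that dominates on small inputs. A NIF, by contrast, runs inside the VM's own process, so a segfault there takes the VM down with it.
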
## Contribution / Bug Reports
* Please make sure you run `git submodule update` after a checkout/pull.
-<p>You can also start with one of Sanitize's built-in configurations and then
-customize it to meet your needs.</p>
-
-<p>The built-in configs are deeply frozen to prevent people from modifying them
-(either accidentally or maliciously). To customize a built-in config, create a
-new copy using <code>Sanitize::Config.merge()</code>, like so:</p>
-
-<div class="highlight highlight-ruby"><pre><span class="c1"># Create a customized copy of the Basic config, adding <div> and <table> to the</span>