This ticket is a follow-up to ticket-create-sitemap-generator-script.md. The initial implementation was completed correctly according to the original specification. However, the specification was flawed due to an oversight in identifying the correct data sources for our content.
The goal of this ticket is to refactor the existing script to use the correct source of truth, making it robust and accurate for generating our sitemap.xml and llm.txt files.
The correct data source is:
learn/tree.json: This is the single source of truth for ALL internal content, including documentation, guides, tutorials, and internally-hosted blog posts.
The apps/portal/resources/data/blog.json file is for presentation purposes on the website and should be ignored for sitemap generation. The approach of scanning the learn/blog directory is also incorrect.
Acceptance Criteria
- Rename the script from buildScripts/sitemap.mjs to buildScripts/generate-seo-files.mjs.
- Refactor the script to ensure it uses learn/tree.json as the single source of truth for all internal URLs.
- Remove any logic that reads from apps/portal/resources/data/blog.json or scans the learn/blog directory.
- The script should export a primary function, e.g., getContentUrls({baseUrl}), that returns a clean, absolute array of all internal site URLs.
- Ensure that URL path segments are joined correctly using forward slashes (/).
- Update the generateSitemap() and generateLlmTxt() functions to use this corrected data source.
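
As a starting point for the refactor, the criteria above could be sketched roughly as follows. This is not the existing implementation: the shape of learn/tree.json (a `data` array of records with `id` and `isLeaf`), the `learn` route prefix, and the helper names `joinUrl` and `urlsFromTree` are all assumptions made for illustration and should be checked against the real file and routing.

```javascript
// Sketch only. ASSUMPTIONS: learn/tree.json looks like
// {"data": [{"id": "guides/Intro", "isLeaf": true}, ...]} and content is
// served under a "learn" route prefix. Adjust both to the real project.

// Join URL segments with single forward slashes, independent of the OS
// path separator (path.join would emit backslashes on Windows).
function joinUrl(...segments) {
    return segments
        .map((segment, index) => {
            const s = String(segment);
            // Only trim trailing slashes on the first segment so "https://" survives.
            return index === 0 ? s.replace(/\/+$/, '') : s.replace(/^\/+|\/+$/g, '');
        })
        .filter(Boolean)
        .join('/');
}

// Hypothetical pure helper: maps parsed tree data to absolute, de-duplicated URLs.
// Records with isLeaf === false are treated as folders and skipped;
// records without an isLeaf flag are treated as content (an assumption).
function urlsFromTree(tree, {baseUrl, routePrefix = 'learn'}) {
    const urls = tree.data
        .filter(node => node.id && node.isLeaf !== false)
        .map(node => joinUrl(baseUrl, routePrefix, node.id));

    return [...new Set(urls)];
}

// Primary function named in the acceptance criteria; in the real module it
// would be exported. treePath is a hypothetical extra option for testing.
async function getContentUrls({baseUrl, treePath = 'learn/tree.json'}) {
    const {readFile} = await import('node:fs/promises');
    const tree = JSON.parse(await readFile(treePath, 'utf8'));

    return urlsFromTree(tree, {baseUrl});
}
```

Keeping the file read in `getContentUrls()` and the mapping in a pure helper means `generateSitemap()` and `generateLlmTxt()` can share one URL list, and the mapping can be unit-tested without touching the file system.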