Upload Your First Document
Add a clinical source and understand what MARCUS does with it after upload.
Uploading is the point where a normal document becomes usable inside MARCUS. The file itself is only the beginning. After upload, MARCUS parses it, splits it into passages, generates retrieval representations, and prepares downstream metadata such as summaries, concepts, and authority cues.
The quality of your uploads strongly influences the quality of your answers. A clean, current protocol usually behaves well. A blurry scan of an outdated draft usually does not.
Before You Upload
Take one minute to check the document first. That minute usually saves far more time later.
Ask:
- Is this the final approved version, or just a draft?
- Is the text selectable, or is it only a scanned image?
- Does this document belong in this project's scope?
- Would I want MARCUS to cite this document in an answer?
- If this source conflicts with another one, do I understand which should carry more weight?
If the answer to either of the last two questions is "no," do not upload it yet.
File Types You Will Most Commonly Use
The current upload flow accepts these common formats:
- PDF
- DOCX
- TXT
- HTML
In some environments, additional office formats may appear, but the safest assumption is that only the formats listed above are supported.
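If you prepare files in bulk, a quick local check can flag anything outside the expected formats before you open the upload dialog. The following is a minimal sketch, not part of MARCUS: the extension set is an assumption based on the formats named in this guide (PDF, DOCX, TXT, HTML), and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: extensions for the formats named in this guide.
# Your MARCUS environment may accept more or fewer.
ACCEPTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".html", ".htm"}


def is_probably_accepted(filename: str) -> bool:
    """Return True if the file extension matches one of the assumed formats."""
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS


def split_by_acceptance(filenames):
    """Split filenames into (likely accepted, needs a closer look)."""
    accepted = [name for name in filenames if is_probably_accepted(name)]
    needs_review = [name for name in filenames if not is_probably_accepted(name)]
    return accepted, needs_review
```

An extension check says nothing about whether the text inside is machine-readable, so a scanned PDF will still pass it.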
Which Files Usually Work Best
| File type or condition | Usually works well? | Notes |
|---|---|---|
| Text-based PDF | Yes | Often the best default for finalized protocols and guidelines |
| DOCX | Yes | Useful when the source exists only as a working document or policy draft |
| Plain TXT | Usually | Good for structured internal notes, but often lacks layout clues |
| HTML | Usually | Useful for exported internal guidance pages |
| Scanned image PDF | Mixed | May parse poorly if the text is not machine-readable |
| Duplicate versions of the same source | Poor practice | Creates retrieval redundancy and confusion |
Step By Step
- Open your project.
- Select Add Sources or Upload Files.
- Choose the file or drag it into the upload area.
- Confirm the upload if MARCUS asks for confirmation.
- Wait for the source to move through ingestion states.
- Do not judge answer quality until the source is fully indexed.
What MARCUS Does After Upload
The ingestion pipeline runs in stages. You do not need to memorize the technical details, but it helps to know the broad flow:
- Source registration: MARCUS creates a source record and stores file metadata.
- Parsing: The file content is extracted into usable text.
- Chunking: The text is broken into smaller passages so retrieval can target the most relevant sections.
- Embedding: MARCUS converts each chunk into a searchable representation.
- Persistence: Chunk records and source metadata are stored for future retrieval.
- Enrichment: Additional processing may create summaries, key points, tags, concepts, authority signals, and related knowledge assets.
You can think of this as turning one long document into a set of searchable evidence units plus a descriptive wrapper that helps humans interpret the source later.
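To make the chunking and embedding stages more concrete, here is a deliberately simplified sketch. It is an illustration only and does not describe how MARCUS is implemented: the `Chunk` record, the window sizes, and the word-count "embedding" are all assumptions chosen to keep the example short. Real systems typically use learned vector embeddings rather than word counts.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical chunk record: one searchable evidence unit."""
    source_id: str
    position: int
    text: str
    vector: Counter = field(default_factory=Counter)


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (a stand-in for real chunking)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    passages = []
    for start in range(0, len(words), step):
        passages.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return passages


def toy_embed(passage: str) -> Counter:
    """Toy 'embedding': lowercase word counts, not a learned vector."""
    return Counter(word.strip(".,;:()").lower() for word in passage.split())


def ingest(source_id: str, text: str) -> list[Chunk]:
    """Chunk already-parsed text, embed each chunk, and return records to persist."""
    return [
        Chunk(source_id, i, passage, toy_embed(passage))
        for i, passage in enumerate(chunk_text(text))
    ]
```

The overlap between windows is one common way to avoid cutting a sentence off from its context at a chunk boundary. It also shows why noisy text extraction hurts: garbled words flow straight into every chunk and its representation.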
How To Know It Is Done
Watch the source status in the project source list or upload tray.
| Status | What it means in plain language | What you should do |
|---|---|---|
| Queued | MARCUS accepted the file and is waiting to process it | Nothing yet; just wait |
| Processing | The document is being parsed, chunked, embedded, or enriched | Avoid judging chat behavior until this finishes |
| Indexed / Ready | The source is available for retrieval | Safe to test in chat and review the briefing |
| Failed | The workflow stopped before completion | Inspect the file quality, try again, or replace the source |
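If your environment exposes source status programmatically, the waiting step can be scripted. The sketch below assumes a hypothetical `get_status` callable that returns one of the status labels from the table above; this guide does not document such an API, so treat the callable and the exact status strings as placeholders.

```python
import time

# Assumption: status strings mirror the labels in the status table.
TERMINAL_STATUSES = {"indexed", "ready", "failed"}


def wait_for_source(get_status, timeout_s: float = 300.0, interval_s: float = 5.0,
                    sleep=time.sleep) -> str:
    """Poll get_status() until a terminal status or the timeout; return the last status seen."""
    deadline = time.monotonic() + timeout_s
    status = get_status().strip().lower()
    while status not in TERMINAL_STATUSES and time.monotonic() < deadline:
        sleep(interval_s)
        status = get_status().strip().lower()
    return status
```

A caller would test in chat only when the result is `"indexed"` or `"ready"`, and inspect the file when it is `"failed"`. Any other return value means the timeout expired while the source was still queued or processing.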
What To Expect Immediately After Upload
- The source may appear in the list before its briefing is ready.
- The title and basic metadata may show up before the full enrichment finishes.
- Large or complex files may take longer.
- Chat quality improves after indexing completes; before that, the source may not be retrievable at all.
This is normal. "Visible in the list" and "fully usable in retrieval" are not always the same moment.
How To Confirm The Upload Worked Well
After the status reaches Indexed / Ready:
- Open the source entry.
- Check that the title and document type look reasonable.
- Read the summary or briefing if present.
- Ask a narrow question that should clearly be answerable from this document.
- Confirm that the citation points back to the expected source.
If the summary looks nonsensical or the chat answer never uses the new source, the problem is often file quality, project scope, or indexing status rather than a random model failure.
A Safe First Upload Strategy
For the first project, upload in this order:
- One current, high-trust protocol or policy
- One supporting reference or pathway
- One document that answers a slightly different but related question
Then test the project before adding more. This lets you see whether the retrieval behavior is clean before the corpus becomes harder to manage.
Good Upload Hygiene
- Upload the final approved version of a protocol, not multiple drafts.
- Prefer text PDFs over scans when possible.
- Keep policies, protocols, lecture notes, and reference documents distinguishable.
- Avoid unnecessary duplicates.
- If a document is old but still important for context, label it clearly in your own workflow and interpret it carefully later.
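One way to catch unnecessary duplicates before upload is to compare content hashes locally. This is a minimal sketch with hypothetical function names, using only the Python standard library; it runs outside MARCUS and assumes you have the candidate files' bytes in hand.

```python
import hashlib
from collections import defaultdict


def content_fingerprint(data: bytes) -> str:
    """SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(data).hexdigest()


def find_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[content_fingerprint(data)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

This only detects exact copies, such as the same file saved under two names. Two drafts of the same protocol with different wording will hash differently, so version checks still need a human eye.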
Common Upload Problems
| Problem | What it usually means | First thing to check |
|---|---|---|
| The source stays in processing for a long time | The file may be large, complex, or waiting on background work | Wait briefly, then refresh and confirm the status changed |
| The source fails | The document may be malformed or parse poorly | Try a cleaner export or a different file version |
| The summary looks wrong | The text extraction may have been noisy or the document may be unusually structured | Open the source and inspect the original file quality |
| The source never appears in answers | It may not be indexed yet, may not match the question, or may not belong in the project | Check status, project scope, and question wording |
| MARCUS keeps citing an older version | Multiple versions may be competing | Remove or isolate duplicate drafts if possible |
When To Re-upload Versus When To Fix The Project
Re-upload when:
- The file is corrupted, unreadable, or obviously parsed poorly
- You accidentally uploaded the wrong version
- The document was replaced by a newer official version
Fix the project instead when:
- The source is valid but does not belong with the other documents
- The project contains too many unrelated topics
- The answer quality problem appears across many sources, not just one file
One Reliable Habit
After every important upload, ask one question that the new source should answer clearly. If MARCUS cannot retrieve it after indexing, investigate immediately while the corpus is still small and easy to fix.