Fix Smart Quotes in Python Code

Replace curly quotes that cause SyntaxError: invalid character in your Python code. Paste your code above and get clean ASCII output instantly, with no server involved and nothing sent.

Rules:

  • Smart Quotes: Converts curly quotes (“” ‘’) to straight quotes. Always active.
  • Non-Breaking Spaces: Replaces non-breaking spaces (U+00A0) with regular spaces. Always active.
  • Line Endings: Normalizes CRLF → LF and trims trailing whitespace per line. Always active.
  • Removes invisible zero-width characters (U+200B, U+200C, U+200D) that silently break string comparisons.
  • Strips the Byte Order Mark (U+FEFF) that causes “invalid character” errors in parsers and editors.
  • Removes soft hyphens (U+00AD) from PDFs that show as garbled characters in code editors.
  • Converts mixed tabs/spaces to a consistent indent width. Click to pick a size.
  • Collapses all line breaks into one continuous paragraph. Great for reflowing PDF or email text.
  • Control blank line density. Click to pick Keep 1 or Remove all.

Text never leaves your browser.

Why smart quotes break Python code

You've copied a Python snippet from a blog post, Slack message, Word document, or email. It looks perfectly fine. You paste it into your editor, run it, and get:

  File "script.py", line 3
    print(“Hello, world!”)
          ^
SyntaxError: invalid character '“' (U+201C)

The culprit is smart quotes (also called curly quotes or typographic quotes). These are Unicode characters that word processors, email clients, and messaging apps automatically substitute for straight ASCII quotes.
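
The error can be reproduced without creating a file. The following is a minimal sketch, assuming Python 3 and the built-in compile() function; the helper name raises_syntax_error is made up for this illustration, and the curly quotes are written as escapes so the example itself stays ASCII:

```python
def raises_syntax_error(source: str) -> bool:
    """Return True if Python refuses to compile the given source text."""
    try:
        compile(source, "script.py", "exec")
    except SyntaxError:
        return True
    return False


# \u201c and \u201d are the left and right curly double quotes.
curly = "print(\u201cHello, world!\u201d)"
straight = 'print("Hello, world!")'

print(raises_syntax_error(curly))     # True
print(raises_syntax_error(straight))  # False
```

compile() only parses the text and never runs it, so this is a safe way to check a pasted snippet.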

Before vs. After: what your code actually looks like

Here's the problem visualized. This is what you think you pasted:

# What you think you pasted: plain ASCII quotes
message = "Hello, world!"
name = 'Alice'
print(f"{name} says: {message}")

But this is what your editor actually received (Unicode code points shown):

# Hidden curly quotes: causes SyntaxError
message = “Hello, world!”       # U+201C and U+201D
name = ‘Alice’                  # U+2018 and U+2019
print(f“{name} says: {message}”)

After cleaning with Unformat (Developer mode), you get valid Python:

# Clean ASCII quotes: runs correctly
message = "Hello, world!"
name = 'Alice'
print(f"{name} says: {message}")

Python expects ASCII quotes

Python's lexer only recognizes two quote characters for string delimiters:

  • Double quote: " (U+0022)
  • Single quote: ' (U+0027)

But word processors and email clients silently replace these with:

  • Left double quote: “ (U+201C), looks like " in most fonts
  • Right double quote: ” (U+201D)
  • Left single quote: ‘ (U+2018), looks like ' in most fonts
  • Right single quote: ’ (U+2019)

These look nearly identical in most fonts, making the bug extremely hard to spot visually. The Python interpreter sees completely different characters and rejects them immediately.
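
Because the characters are hard to tell apart by eye, it can help to let Python point them out. This is a rough sketch using the standard unicodedata module; find_non_ascii is a hypothetical helper name, and it flags every non-ASCII character, including ones that may be legitimate inside strings or comments:

```python
import unicodedata


def find_non_ascii(source: str):
    """Yield (line, column, code point, Unicode name) for each non-ASCII character."""
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col_no, char in enumerate(line, start=1):
            if ord(char) > 127:
                name = unicodedata.name(char, "UNNAMED")
                yield line_no, col_no, f"U+{ord(char):04X}", name


pasted = "message = \u201cHello\u201d\nname = \u2018Alice\u2019\n"
for hit in find_non_ascii(pasted):
    print(hit)
```

Each reported tuple gives a 1-based line and column, so the offending character can be located directly in an editor.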

Where smart quotes come from

Microsoft Word and Google Docs have “smart quotes” enabled by default. Any code snippet typed or pasted into these editors gets its quotes silently replaced.

Slack, Teams, and Discord apply the same transformation outside code blocks. When a colleague pastes code into a Slack message without wrapping it in backticks, the quotes are converted to Unicode curly quotes.

Blog posts and tutorials copied from web pages frequently carry smart quotes, especially if the content was authored in a CMS like WordPress or Medium that applies typographic transformations. Even Stack Overflow answers can contain them if the author composed their answer in a word processor first.

PDF documents almost always use smart quotes. Copying code from a textbook, research paper, or documentation PDF is one of the most common sources of this issue.

The problem goes beyond quotes

Smart quotes rarely travel alone. The same sources that produce curly quotes also inject:

  • Non-breaking spaces (U+00A0) that look like regular spaces but do not count as indentation, so Python rejects the line (recent Python 3 versions report SyntaxError: invalid non-printable character U+00A0)
  • Zero-width spaces (U+200B) that hide inside identifiers and string literals, triggering a SyntaxError in code and silently breaking string comparisons
  • Em-dashes (U+2014) that replace minus signs and hyphens, breaking arithmetic expressions
  • BOM markers (U+FEFF) at the start of pasted text that cause SyntaxError once the snippet lands anywhere other than the very start of a file

Unformat's Developer mode catches all of these simultaneously.
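
The invisible characters are the hardest to notice because nothing looks wrong on screen. Here is a small sketch of the effect on string data, assuming the standard unicodedata module; invisible_chars is an illustrative helper, not part of Unformat, and it relies on zero-width characters, the BOM, and soft hyphens all belonging to the Unicode "Cf" (format) category:

```python
import unicodedata

clean = "admin"
pasted = "ad\u200bmin"  # zero-width space hidden in the middle

print(clean == pasted)          # False
print(len(clean), len(pasted))  # 5 6


def invisible_chars(text: str) -> list:
    """Return code points of format (Cf) characters and non-breaking spaces in text."""
    return [
        f"U+{ord(char):04X}"
        for char in text
        if unicodedata.category(char) == "Cf" or char == "\u00a0"
    ]


print(invisible_chars(pasted))  # ['U+200B']
```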

How to sanitize Python code with Unformat

Use Developer mode for code cleaning. It replaces all Unicode quote variants with their ASCII equivalents and simultaneously fixes every other invisible character that breaks Python.

Developer mode handles the full set of problematic characters:

  • “ ” → " and ‘ ’ → ' (smart quotes → straight ASCII quotes)
  • U+00A0 → regular space (fixes broken indentation)
  • U+200B U+200C U+200D → removed entirely (fixes phantom characters)
  • U+2014 U+2013 → - (em/en-dashes → regular hyphens)
  • Tab characters → spaces (configurable: 2 or 4 spaces, or keep tabs)
  • BOM markers (U+FEFF) → removed (fixes SyntaxError from a stray BOM)
  • \r\n → \n (CRLF → LF line endings)

Fixing it programmatically in Python

If you need to fix smart quotes in a Python script rather than manually, here's a quick approach:

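def fix_smart_quotes(text: str) -> str:
    """Replace curly quotes, non-breaking spaces, and dashes with ASCII."""
    replacements = {
        '\u201c': '"', '\u201d': '"',  # curly double quotes
        '\u2018': "'", '\u2019': "'",  # curly single quotes
        '\u00a0': ' ',                  # non-breaking space
        '\u2014': '-', '\u2013': '-',  # em and en dashes
    }
    # str.translate applies every substitution in a single pass
    return text.translate(str.maketrans(replacements))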
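
The function above leaves zero-width characters, BOMs, soft hyphens, tabs, and CRLF line endings untouched. A fuller sketch that approximates the remaining rules might look like the following. It is an assumption about how such cleanup could be written in Python, not Unformat's actual implementation, and the names sanitize_code, REPLACEMENTS, INVISIBLE, and TABLE are made up for this example:

```python
import re

# Characters mapped to ASCII equivalents.
REPLACEMENTS = {
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u00a0": " ",                  # non-breaking space
    "\u2014": "-", "\u2013": "-",   # em and en dashes
}
# Removed outright: zero-width space, non-joiner, joiner, BOM, soft hyphen.
INVISIBLE = "\u200b\u200c\u200d\ufeff\u00ad"

TABLE = str.maketrans(REPLACEMENTS)
TABLE.update(dict.fromkeys(map(ord, INVISIBLE)))  # mapping to None deletes


def sanitize_code(text: str, tab_width: int = 4) -> str:
    """Apply quote, space, dash, invisible-character, tab, and line-ending fixes."""
    text = text.translate(TABLE)
    text = text.replace("\r\n", "\n").replace("\r", "\n")    # CRLF/CR -> LF
    text = text.expandtabs(tab_width)                         # tabs -> spaces at tab stops
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)   # trim trailing whitespace


print(sanitize_code("\ufeffx = \u201chi\u201d\r\n\tprint(x)  \r\n"))
```

Note that expandtabs also rewrites tabs inside string literals, so this sketch suits pasted snippets better than files where tabs are meaningful.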

But when you just need to quickly clean a pasted snippet, Unformat is faster than writing a script. Paste, clean, copy: done in under a second.

All processing runs in your browser. Your code never touches a server, which matters when you're working with proprietary code or sensitive data.

How to clean your text

  1. Copy the Python code that has smart quote errors.
  2. Switch to Developer mode using the toggle above the text area.
  3. Paste your code into the text area (Ctrl+V or Cmd+V).
  4. The code is instantly cleaned: curly quotes become straight quotes, invisible characters are removed.
  5. Verify the indentation setting (2 or 4 spaces) matches your project via the gear icon.
  6. Click "Copy Clean Text" or press Ctrl+K, then paste into your editor.
  7. Run your Python code. The SyntaxError should be gone.
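
Before running the cleaned code, the result of the last step can be checked without executing anything. This is a hedged sketch using the standard ast module; check_cleaned is a hypothetical helper, and a report of remaining non-ASCII characters is not necessarily a problem, since they can be legitimate inside string content or comments:

```python
import ast


def check_cleaned(source: str) -> list:
    """Return a list of problems found in the source (an empty list means none)."""
    problems = []
    if not source.isascii():
        problems.append("non-ASCII characters remain")
    try:
        ast.parse(source)  # parses only, never runs the code
    except SyntaxError as exc:
        problems.append(f"line {exc.lineno}: {exc.msg}")
    return problems


print(check_cleaned('print("Hello, world!")\n'))  # []
print(check_cleaned("print(\u201cHello\u201d)\n"))
```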

Frequently Asked Questions

Why does Python show SyntaxError: invalid character?

Python's lexer only accepts ASCII quote characters (U+0022 for double quotes, U+0027 for single quotes) as string delimiters. When you paste code that contains Unicode curly quotes (U+201C, U+201D, U+2018, U+2019), Python sees them as invalid characters because they are not recognized quote delimiters. The error message usually includes the Unicode code point, like U+201C, which confirms the issue is a smart quote.

Which Python versions are affected by smart quotes?

All Python versions are affected, including Python 2.7, 3.x, and the latest releases. The Python lexer has never accepted Unicode curly quotes as string delimiters and likely never will, since doing so would be ambiguous (these characters are valid in Unicode string content, just not as delimiters).

Can non-breaking spaces cause IndentationError in Python?

They do break indentation, although the error Python reports is usually a SyntaxError rather than an IndentationError. Python uses indentation for block structure, and it expects regular ASCII spaces (U+0020) or tabs. A non-breaking space (U+00A0) looks identical to a regular space but is a different character, and Python treats it as a non-whitespace character. The line therefore appears unindented and begins with a character the lexer rejects; recent Python 3 versions report SyntaxError: invalid non-printable character U+00A0. Unformat converts all non-breaking spaces to regular spaces.
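
To find the affected lines in a script, a short check along these lines may help. It is a sketch under the assumption that only leading whitespace matters; lines_with_nbsp_indent and compiles are illustrative names. The exact error class and message vary between Python versions, so the code catches SyntaxError, which also covers IndentationError because that is a subclass:

```python
NBSP = "\u00a0"


def lines_with_nbsp_indent(source: str) -> list:
    """Return 1-based numbers of lines whose leading whitespace contains U+00A0."""
    hits = []
    for line_no, line in enumerate(source.split("\n"), start=1):
        # Leading run of spaces, tabs, and non-breaking spaces.
        indent = line[: len(line) - len(line.lstrip(" \t" + NBSP))]
        if NBSP in indent:
            hits.append(line_no)
    return hits


def compiles(source: str) -> bool:
    """Return True if the source parses; IndentationError is a SyntaxError subclass."""
    try:
        compile(source, "<pasted>", "exec")
    except SyntaxError:
        return False
    return True


pasted = "if True:\n" + NBSP * 4 + "print('hi')\n"
fixed = pasted.replace(NBSP, " ")

print(lines_with_nbsp_indent(pasted))    # [2]
print(compiles(pasted), compiles(fixed)) # False True
```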

How do I prevent smart quotes when copying code from Slack?

In Slack, always share code inside code blocks (wrap with backticks ` for inline or ``` for multi-line). Slack preserves straight quotes inside code blocks but converts them to smart quotes in regular message text. If someone has already sent code without backticks, copy it and paste it into Unformat to fix the quotes.

Does this tool handle Python f-strings and triple-quoted strings?

Yes. Unformat replaces all Unicode curly quotes with ASCII straight quotes regardless of their position in the text. This correctly handles f-strings (f"..."), triple-quoted strings ("""..."""), raw strings (r"..."), and byte strings (b"..."). The tool processes at the character level, so string type prefixes are preserved.
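
The same character-level idea can be sketched in Python to show why prefixes survive: only the quote characters are swapped, so the f, r, and b in front of them are left alone. QUOTE_TABLE and the sample snippet are made up for this illustration and do not reflect Unformat's internals:

```python
QUOTE_TABLE = str.maketrans({
    "\u201c": '"', "\u201d": '"',  # curly double quotes
    "\u2018": "'", "\u2019": "'",  # curly single quotes
})

# A pasted snippet in which every string delimiter is a curly quote.
pasted = (
    "name = \u2018Alice\u2019\n"
    "greeting = f\u201c{name} says hi\u201d\n"
    "doc = \u201c\u201c\u201cline one\nline two\u201d\u201d\u201d\n"
    "pattern = r\u2018\\d+\u2019\n"
    "data = b\u201cbytes\u201d\n"
)

cleaned = pasted.translate(QUOTE_TABLE)

namespace = {}
exec(cleaned, namespace)      # parses because every delimiter is now ASCII
print(namespace["greeting"])  # Alice says hi
```

One caveat applies to any character-level replacement: a curly apostrophe inside a single-quoted string also becomes a straight quote and can end the string early, so lines containing apostrophes are worth a second look.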