Python RegEx

A RegEx, or regular expression, is a sequence of characters that forms a search pattern.

A RegEx can be used to check whether a string contains the specified search pattern.

RegEx Module

Python has a built-in module called re, which can be used to work with regular expressions.

Import the re module:

import re

RegEx in Python

Once you have imported the re module, you can start using regular expressions:

Example

Search the string to see if it starts with "The" and ends with "Spain":

import re

txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)
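
The example above stores the result in x without inspecting it. As a minimal sketch of one way to act on the result, the following relies on re.search() returning None when there is no match; the printed messages are illustrative only:

```python
import re

txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)

# A Match object is truthy; None (no match) is falsy
if x:
    print("YES! We have a match!")
else:
    print("No match")
```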

RegEx Functions

The re module offers a set of functions that allow us to search a string for a match:

Function	Description
findall	Returns a list containing all matches
search	Returns a Match object if there is a match anywhere in the string
split	Returns a list where the string has been split at each match
sub	Replaces one or many matches with a string
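
To make the table concrete, here is a small sketch that calls each of the four functions on the same string. The variable names and the "_" replacement text are chosen for illustration only:

```python
import re

txt = "The rain in Spain"

all_matches = re.findall("ai", txt)   # list of every non-overlapping match
first_match = re.search("ai", txt)    # Match object for the first match, or None
pieces = re.split(" ", txt)           # list of the parts between matches
replaced = re.sub(" ", "_", txt)      # new string with each match replaced

print(all_matches)          # ['ai', 'ai']
print(first_match.start())  # 5
print(pieces)               # ['The', 'rain', 'in', 'Spain']
print(replaced)             # The_rain_in_Spain
```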

Metacharacters

Metacharacters are characters with a special meaning:

Character	Description	Example
[]	A set of characters	"[a-m]"
\	Signals a special sequence (can also be used to escape special characters)	"\d"
.	Any character (except newline character)	"he..o"
^	Starts with	"^hello"
$	Ends with	"planet$"
*	Zero or more occurrences	"he.*o"
+	One or more occurrences	"he.+o"
?	Zero or one occurrences	"he.?o"
{}	Exactly the specified number of occurrences	"he{2}o"
|	Either or	"falls|stays"
()	Capture and group
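
The following sketch tries several of the metacharacters from the table with re.findall(). The sample strings are assumptions chosen for illustration:

```python
import re

txt = "hello planet"

# "." matches any single character except a newline
two_any = re.findall("he..o", txt)      # ['hello']
# "*" allows zero or more, "+" one or more, "?" zero or one of the preceding item
star = re.findall("he.*o", txt)         # ['hello']
plus = re.findall("he.+o", txt)         # ['hello']
optional = re.findall("he.?o", txt)     # [] (two characters sit between "he" and "o")
# "^" anchors the pattern at the start of the string, "$" at the end
starts = re.findall("^hello", txt)      # ['hello']
ends = re.findall("planet$", txt)       # ['planet']
# "|" matches either alternative
either = re.findall("falls|stays", "The rain falls mainly in the plain")  # ['falls']

print(two_any, star, plus, optional, starts, ends, either)
```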

Special Sequences

A special sequence is a \ followed by one of the characters in the list below, and has a special meaning:

Character	Description	Example
\A	Returns a match if the specified characters are at the beginning of the string	"\AThe"
\b	Returns a match where the specified characters are at the beginning or at the end of a word (the "r" in the beginning is making sure that the string is being treated as a "raw string")	r"\bain" r"ain\b"
\B	Returns a match where the specified characters are present, but NOT at the beginning (or at the end) of a word (the "r" in the beginning is making sure that the string is being treated as a "raw string")	r"\Bain" r"ain\B"
\d	Returns a match where the string contains digits (numbers from 0-9)	"\d"
\D	Returns a match where the string DOES NOT contain digits	"\D"
\s	Returns a match where the string contains a white space character	"\s"
\S	Returns a match where the string DOES NOT contain a white space character	"\S"
\w	Returns a match where the string contains any word characters (letters from a to z and A to Z, digits from 0-9, and the underscore _ character)	"\w"
\W	Returns a match where the string DOES NOT contain any word characters	"\W"
\Z	Returns a match if the specified characters are at the end of the string	"Spain\Z"
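
A short sketch of some of these sequences in use. The sample string is an assumption chosen for illustration, and every pattern is written as a raw string so that Python does not interpret the backslash itself:

```python
import re

txt = "The rain in Spain, 2 days out of 7"

at_start = re.findall(r"\AThe", txt)    # ['The']
word_start = re.findall(r"\bain", txt)  # [] (no word begins with "ain")
word_end = re.findall(r"ain\b", txt)    # ['ain', 'ain']
digits = re.findall(r"\d", txt)         # ['2', '7']
spaces = re.findall(r"\s", txt)         # one entry per white-space character
at_end = re.findall(r"7\Z", txt)        # ['7']

print(at_start, word_start, word_end, digits, len(spaces), at_end)
```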

Sets

A set is a group of characters inside a pair of square brackets [] with a special meaning:

Set	Description
[arn]	Returns a match where one of the specified characters (`a`, `r`, or `n`) is present
[a-n]	Returns a match for any lower case character, alphabetically between `a` and `n`
[^arn]	Returns a match for any character EXCEPT `a`, `r`, and `n`
[0123]	Returns a match where any of the specified digits (`0`, `1`, `2`, or `3`) are present
[0-9]	Returns a match for any digit between `0` and `9`
[0-5][0-9]	Returns a match for any two-digit number from `00` to `59`
[a-zA-Z]	Returns a match for any character alphabetically between `a` and `z`, lower case OR upper case
[+]	In sets, `+`, `*`, `.`, `|`, `()`, `$`, `{}` have no special meaning, so `[+]` means: return a match for any `+` character in the string
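
The sets in the table can be tried out with re.findall(). This is a sketch only; the sample strings and variable names are assumptions chosen for illustration:

```python
import re

any_of = re.findall("[arn]", "The rain in Spain")              # ['r', 'a', 'n', 'n', 'a', 'n']
none_of = re.findall("[^arn]", "rain")                         # ['i']
two_digits = re.findall("[0-5][0-9]", "8 times before 11:45 AM")  # ['11', '45']
letters = re.findall("[a-zA-Z]", "8 AM")                       # ['A', 'M']
literal_plus = re.findall("[+]", "2+2=4")                      # ['+']

print(any_of, none_of, two_digits, letters, literal_plus)
```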

The findall() Function

The findall() function returns a list containing all matches.

Example

Print a list of all matches:

import re

txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)

The list contains the matches in the order they are found.

If no matches are found, an empty list is returned:

Example

Return an empty list if no match was found:

import re

txt = "The rain in Spain"
x = re.findall("Portugal", txt)
print(x)

The search() Function

The search() function searches the string for a match, and returns a Match object if there is a match.

If there is more than one match, only the first occurrence of the match will be returned:

Example

Search for the first white-space character in the string:

import re

txt = "The rain in Spain"
x = re.search(r"\s", txt)

print("The first white-space character is located in position:", x.start())

If no matches are found, the value None is returned:

Example

Make a search that returns no match:

import re

txt = "The rain in Spain"
x = re.search("Portugal", txt)
print(x)
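
Because search() returns None when nothing matches, calling a method such as .start() on the result would raise an AttributeError. A minimal sketch of a guard follows; the helper name first_position and the -1 sentinel are hypothetical choices, not part of the re module:

```python
import re

def first_position(pattern, text):
    """Return the start index of the first match, or -1 if there is none."""
    match = re.search(pattern, text)
    if match is None:
        return -1
    return match.start()

txt = "The rain in Spain"
print(first_position(r"\s", txt))       # 3
print(first_position("Portugal", txt))  # -1
```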

The split() Function

The split() function returns a list where the string has been split at each match:

Example

Split at each white-space character:

import re

txt = "The rain in Spain"
x = re.split(r"\s", txt)
print(x)

You can control the number of occurrences by specifying the maxsplit parameter:

Example

Split the string only at the first occurrence:

import re

txt = "The rain in Spain"
x = re.split(r"\s", txt, maxsplit=1)
print(x)
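
To see how maxsplit changes the result, this sketch compares a few values on the same string. Whatever remains after the last allowed split is returned as the final list element:

```python
import re

txt = "The rain in Spain"

no_limit = re.split(r"\s", txt)               # ['The', 'rain', 'in', 'Spain']
one_split = re.split(r"\s", txt, maxsplit=1)  # ['The', 'rain in Spain']
two_splits = re.split(r"\s", txt, maxsplit=2) # ['The', 'rain', 'in Spain']

print(no_limit)
print(one_split)
print(two_splits)
```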

The sub() Function

The sub() function replaces the matches with the text of your choice:

Example

Replace every white-space character with the number 9:

import re

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)
print(x)

You can control the number of replacements by specifying the count parameter:

Example

Replace the first 2 occurrences:

import re

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt, count=2)
print(x)
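
As a quick comparison, this sketch runs the same substitution with different count values. Omitting count replaces every match:

```python
import re

txt = "The rain in Spain"

replace_all = re.sub(r"\s", "9", txt)           # The9rain9in9Spain
replace_one = re.sub(r"\s", "9", txt, count=1)  # The9rain in Spain
replace_two = re.sub(r"\s", "9", txt, count=2)  # The9rain9in Spain

print(replace_all)
print(replace_one)
print(replace_two)
```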

Match Object

A Match object is an object containing information about the search and the result.

Note: If there is no match, the value None will be returned, instead of the Match object.

Example

Do a search that will return a Match object:

import re

txt = "The rain in Spain"
x = re.search("ai", txt)
print(x)  # this will print a Match object

The Match object has properties and methods used to retrieve information about the search and the result:

.span() returns a tuple containing the start and end positions of the match.
.string returns the string passed into the function.
.group() returns the part of the string where there was a match.

Example

Print the position (start and end position) of the first match occurrence.

The regular expression looks for any word that starts with an upper case "S":

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())

Example

Print the string passed into the function:

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.string)

Example

Print the part of the string where there was a match.

The regular expression looks for any word that starts with an upper case "S":

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())
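
The three members relate to each other: slicing .string with the positions from .span() gives the same text as .group(). A minimal sketch that shows this, with a None check so it does not fail when nothing matches:

```python
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)

if x is not None:
    start, end = x.span()
    print(start, end)                         # 12 17
    print(x.string[start:end])                # Spain
    print(x.group() == x.string[start:end])   # True
```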

Note: If there is no match, the value None will be returned, instead of the Match object.
