Check for URL in a String - Python
Last Updated: 24 Dec, 2025
We are given a string that may contain one or more URLs and our task is to extract them efficiently. This is useful for web scraping, text processing, and data validation.
For Example:
Input: s = "My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/"
Output: ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']
The output is a list containing all the URLs.
Below are several methods to perform this task:
Using re.findall()
The re.findall() function finds all non-overlapping occurrences of a pattern in a given string and returns them as a list.
Python
import re
s = "My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/"
pattern = r'https?://\S+|www\.\S+'
print("URLs:", re.findall(pattern, s))
Output
URLs: ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']
Explanation:
- https?://\S+ matches URLs starting with http:// or https://.
- www\.\S+ matches URLs starting with www.
- findall(): returns all matches in a list.
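One thing to watch for with this pattern: \S+ consumes every non-whitespace character, so punctuation that follows a URL in running text (a comma, a full stop, a closing bracket) ends up inside the match. The following is a minimal sketch of one way to handle that, assuming it is acceptable to strip a fixed set of trailing punctuation characters. The helper name extract_urls, the TRAILING set, and the sample text are illustrative and not part of the example above.

```python
import re

# Same pattern as above: a scheme-prefixed URL or a bare "www." host.
URL_PATTERN = re.compile(r'https?://\S+|www\.\S+')

# Assumption: these characters never form a meaningful end of a URL in our input.
TRAILING = '.,;:!?)"\''

def extract_urls(text):
    """Hypothetical helper: find URLs and drop trailing punctuation."""
    return [match.rstrip(TRAILING) for match in URL_PATTERN.findall(text)]

text = "Docs are at https://www.geeksforgeeks.org/, see also (www.example.com)."
print(extract_urls(text))
# ['https://www.geeksforgeeks.org/', 'www.example.com']
```

Stripping is a heuristic: a URL that legitimately ends in one of these characters would be shortened, so adjust TRAILING to suit the text being processed.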
Using urlparse()
urlparse() function from urllib.parse breaks down a URL into components like scheme, domain, path, and query.
Python
from urllib.parse import urlparse
s = 'My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/'
s1 = s.split()
urls = []
for word in s1:
    parsed = urlparse(word)
    if parsed.scheme and parsed.netloc:
        urls.append(word)
print("URLs:", urls)
Output
URLs: ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']
Explanation:
- s.split(): splits the string into words.
- urlparse(word): parses each word into components; a word is kept only if both its scheme (such as http or https) and its netloc (the domain) are non-empty.
- urls.append(word): adds each matching word to the urls list.
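To make the scheme and netloc check more concrete, here is a small sketch that prints the components urlparse() returns for one of the URLs above. The helper looks_like_url and the ftp address are illustrative assumptions, not part of the original example.

```python
from urllib.parse import urlparse

parsed = urlparse("https://www.geeksforgeeks.org/404.html/")
print(parsed.scheme)   # https
print(parsed.netloc)   # www.geeksforgeeks.org
print(parsed.path)     # /404.html/

def looks_like_url(word):
    """Hypothetical helper: True when a word has both a scheme and a network location."""
    parts = urlparse(word)
    return bool(parts.scheme and parts.netloc)

print(looks_like_url("portal"))                      # False
print(looks_like_url("ftp://example.com/file.txt"))  # True
```

Note that this check accepts any scheme, not only http and https. If only web URLs are wanted, one option is to additionally test that parts.scheme is in ("http", "https").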
Using urlextract
urlextract is a third-party Python library for extracting URLs from text without writing complex regular expressions. Use the following pip command to install it:
pip install urlextract
Python
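from urlextract import URLExtract
s = 'My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/'
extractor = URLExtract()
urls = extractor.find_urls(s)
print("URLs:", urls)
Output
URLs: ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']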
Explanation:
- URLExtract(): creates an extractor object to scan the string.
- find_urls(): detects all URLs in s and returns them as a list; no manual splitting or validation is needed.
Using startswith()
This approach splits the text into words and checks each word using the built-in startswith() method to see if it begins with "http:" or "https:". Matching words are treated as URLs and collected.
Python
s = 'My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/'
x = s.split()
res = []
for i in x:
    if i.startswith("https:") or i.startswith("http:"):
        res.append(i)
print("Urls:", res)
Output
Urls: ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']
Explanation:
- s.split(): splits the string into words.
- i.startswith("https:") or i.startswith("http:"): Checks whether the word begins with a URL scheme.
- res.append(i): Adds the word to the list if it is a URL.
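The two startswith() calls can also be combined, because str.startswith() accepts a tuple of prefixes and returns True if any of them matches. A minimal sketch follows; the helper name urls_by_prefix and the stricter "http://" and "https://" prefixes are choices made for this illustration.

```python
def urls_by_prefix(text, prefixes=("http://", "https://")):
    """Hypothetical helper: collect whitespace-separated words that begin with any prefix."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return [word for word in text.split() if word.startswith(prefixes)]

s = "My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/"
print(urls_by_prefix(s))
# ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']
```

Including the "//" in each prefix means a bare word such as "https:" is not mistaken for a URL.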
Using find() method
find() is a built-in string method that returns the lowest index at which a substring occurs, or -1 if the substring is not found. A word that starts with "http:" or "https:" returns 0, so we can use find() to identify and extract URLs from a string.
Python
s = 'My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/'
s1 = s.split()
res = []
for i in s1:
    if i.find("https:") == 0 or i.find("http:") == 0:
        res.append(i)
print("Urls:", res)
Output
Urls: ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']
Explanation:
- s.split(): splits the string into words.
- i.find("https:") == 0 or i.find("http:") == 0: Checks if the word starts with "https:" or "http:".
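Because find() returns a position, it can also locate URLs directly in the full string without splitting it first. The sketch below is one possible way to do this, assuming URLs are terminated by a space character or the end of the string (other whitespace such as tabs or newlines is not handled). The helper name urls_via_find is hypothetical.

```python
def urls_via_find(text, prefixes=("http://", "https://")):
    """Hypothetical helper: scan text with str.find() and slice out each URL."""
    urls = []
    pos = 0
    while pos < len(text):
        # Earliest occurrence of any prefix at or after pos; find() returns -1 when absent.
        hits = [i for i in (text.find(p, pos) for p in prefixes) if i != -1]
        if not hits:
            break
        start = min(hits)
        # Assumption: the URL ends at the next space, or at the end of the string.
        end = text.find(" ", start)
        if end == -1:
            end = len(text)
        urls.append(text[start:end])
        pos = end
    return urls

s = "My Profile: https://www.geeksforgeeks.org/404.html/ in the portal of https://www.geeksforgeeks.org/"
print(urls_via_find(s))
# ['https://www.geeksforgeeks.org/404.html/', 'https://www.geeksforgeeks.org/']
```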