Commit ba737c3

gguuss authored and dpebot committed
Adds document text detection tutorial. (GoogleCloudPlatform#868)
* Adds document text detection tutorial.
* Feedback from review
* Less whitespace and fewer hanging indents
1 parent d5faacf commit ba737c3

7 files changed: +280 -0 lines changed

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
output-text.jpg

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
.. This file is automatically generated. Do not edit this file directly.

Google Cloud Vision API Python Samples
===============================================================================

This directory contains samples for Google Cloud Vision API. `Google Cloud Vision API`_ allows developers to easily integrate vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content.


.. _Google Cloud Vision API: https://cloud.google.com/vision/docs

Setup
-------------------------------------------------------------------------------


Authentication
++++++++++++++

Authentication is typically done through `Application Default Credentials`_,
which means you do not have to change the code to authenticate as long as
your environment has credentials. You have a few options for setting up
authentication:

#. When running locally, use the `Google Cloud SDK`_

    .. code-block:: bash

        gcloud beta auth application-default login


#. When running on App Engine or Compute Engine, credentials are already
   set up. However, you may need to configure your Compute Engine instance
   with `additional scopes`_.

#. You can create a `Service Account key file`_. This file can be used to
   authenticate to Google Cloud Platform services from any environment. To use
   the file, set the ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable to
   the path to the key file, for example:

    .. code-block:: bash

        export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json

.. _Application Default Credentials: https://cloud.google.com/docs/authentication#getting_credentials_for_server-centric_flow
.. _additional scopes: https://cloud.google.com/compute/docs/authentication#using
.. _Service Account key file: https://developers.google.com/identity/protocols/OAuth2ServiceAccount#creatinganaccount

Install Dependencies
++++++++++++++++++++

#. Install `pip`_ and `virtualenv`_ if you do not already have them.

#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

    .. code-block:: bash

        $ virtualenv env
        $ source env/bin/activate

#. Install the dependencies needed to run the samples.

    .. code-block:: bash

        $ pip install -r requirements.txt

.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/

Samples
-------------------------------------------------------------------------------

Document Text tutorial
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


To run this sample:

.. code-block:: bash

    $ python doctext.py

    usage: doctext.py [-h] image_file

    positional arguments:
      image_file  The image for text detection.

    optional arguments:
      -h, --help  show this help message and exit


The client library
-------------------------------------------------------------------------------

This sample uses the `Google Cloud Client Library for Python`_.
You can read the documentation for more details on API usage and use GitHub
to `browse the source`_ and `report issues`_.

.. _Google Cloud Client Library for Python:
    https://googlecloudplatform.github.io/google-cloud-python/
.. _browse the source:
    https://github.com/GoogleCloudPlatform/google-cloud-python
.. _report issues:
    https://github.com/GoogleCloudPlatform/google-cloud-python/issues


.. _Google Cloud SDK: https://cloud.google.com/sdk/
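
The authentication options in the README above can be checked before running the sample. The following is a minimal sketch, not part of this commit: ``describe_credentials`` is a hypothetical helper that only inspects the ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable named in the README. It does not contact Google Cloud and cannot tell whether gcloud or platform credentials are actually present.

```python
import os


def describe_credentials(environ=None):
    """Describe what GOOGLE_APPLICATION_CREDENTIALS currently points to.

    Hypothetical helper for illustration only: it looks at the environment
    variable from the README and nothing else.
    """
    environ = os.environ if environ is None else environ
    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        # Options 1 and 2 in the README (gcloud login, App Engine or
        # Compute Engine) do not use this variable at all.
        return ("GOOGLE_APPLICATION_CREDENTIALS is not set; "
                "relying on gcloud or platform credentials")
    if not os.path.isfile(key_path):
        return "GOOGLE_APPLICATION_CREDENTIALS points to a missing file: " + key_path
    return "using service account key file: " + key_path


if __name__ == "__main__":
    print(describe_credentials())
```

Passing a plain dict instead of ``os.environ`` keeps the check easy to exercise without touching the real environment.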
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# This file is used to generate README.rst

product:
  name: Google Cloud Vision API
  short_name: Cloud Vision API
  url: https://cloud.google.com/vision/docs
  description: >
    `Google Cloud Vision API`_ allows developers to easily integrate vision
    detection features within applications, including image labeling, face and
    landmark detection, optical character recognition (OCR), and tagging of
    explicit content.

setup:
- auth
- install_deps

samples:
- name: Document Text tutorial
  file: doctext.py
  show_help: True

cloud_client_library: true
Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
#!/usr/bin/env python

# Copyright 2017 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Outlines document text given an image.

Example:
    python doctext.py resources/text_menu.jpg
"""
# [START full_tutorial]
# [START imports]
import argparse
from enum import Enum
import io

from google.cloud import vision
from PIL import Image, ImageDraw
# [END imports]


class FeatureType(Enum):
    PAGE = 1
    BLOCK = 2
    PARA = 3
    WORD = 4
    SYMBOL = 5


def draw_boxes(image, blocks, color):
    """Draw an outline around each bounding box in ``blocks``."""
    # [START draw_blocks]
    draw = ImageDraw.Draw(image)

    for block in blocks:
        # Each bounding box has four vertices; no fill, outline in `color`.
        draw.polygon([
            block.vertices[0].x, block.vertices[0].y,
            block.vertices[1].x, block.vertices[1].y,
            block.vertices[2].x, block.vertices[2].y,
            block.vertices[3].x, block.vertices[3].y], None, color)
    return image
    # [END draw_blocks]


def get_document_bounds(image_file, feature):
    # [START detect_bounds]
    """Returns document bounds given an image."""
    vision_client = vision.Client()

    bounds = []

    # Use a separate name for the handle so the path argument is not shadowed.
    with io.open(image_file, 'rb') as image_handle:
        content = image_handle.read()

    image = vision_client.image(content=content)
    document = image.detect_full_text()

    # Collect specified feature bounds by enumerating all document features
    for page in document.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    for symbol in word.symbols:
                        if (feature == FeatureType.SYMBOL):
                            bounds.append(symbol.bounding_box)

                    if (feature == FeatureType.WORD):
                        bounds.append(word.bounding_box)

                if (feature == FeatureType.PARA):
                    bounds.append(paragraph.bounding_box)

            if (feature == FeatureType.BLOCK):
                bounds.append(block.bounding_box)

        # As written, the PAGE case records the bounding box of the last
        # block on the page.
        if (feature == FeatureType.PAGE):
            bounds.append(block.bounding_box)

    return bounds
    # [END detect_bounds]


def render_doc_text(filein, fileout):
    # [START render_doc_text]
    image = Image.open(filein)
    bounds = get_document_bounds(filein, FeatureType.PAGE)
    draw_boxes(image, bounds, 'blue')
    bounds = get_document_bounds(filein, FeatureType.PARA)
    draw_boxes(image, bounds, 'red')
    bounds = get_document_bounds(filein, FeatureType.WORD)
    draw_boxes(image, bounds, 'yellow')

    # Compare by value: `is not 0` tests object identity, not equality.
    if fileout != 0:
        image.save(fileout)
    else:
        image.show()
    # [END render_doc_text]


if __name__ == '__main__':
    # [START run_crop]
    parser = argparse.ArgumentParser()
    parser.add_argument('detect_file', help='The image for text detection.')
    parser.add_argument('-out_file', help='Optional output file', default=0)
    args = parser.parse_args()

    render_doc_text(args.detect_file, args.out_file)
    # [END run_crop]
# [END full_tutorial]
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# Copyright 2017 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import doctext


def test_text(cloud_config, capsys):
    """Checks that the output image with the text outlines is created."""
    doctext.render_doc_text('resources/text_menu.jpg', 'output-text.jpg')
    out, _ = capsys.readouterr()
    assert os.path.isfile('output-text.jpg')
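
Review note: ``test_text`` goes through ``render_doc_text``, so it reaches the Vision API and needs credentials. A complementary offline check could cover the vertex flattening that ``draw_boxes`` does inline. The sketch below assumes that logic were pulled into a helper. ``polygon_coords`` is a hypothetical function that does not exist in ``doctext.py``, and the bounding box is a ``SimpleNamespace`` stand-in with the same ``vertices``, ``x`` and ``y`` attribute names the sample reads.

```python
from types import SimpleNamespace


def polygon_coords(bounding_box):
    """Flatten a bounding box's vertices into [x0, y0, x1, y1, ...].

    Hypothetical helper: draw_boxes builds this list inline for four vertices.
    """
    coords = []
    for vertex in bounding_box.vertices:
        coords.extend((vertex.x, vertex.y))
    return coords


def test_polygon_coords():
    """Runs without network access or credentials."""
    corners = [(0, 0), (10, 0), (10, 5), (0, 5)]
    box = SimpleNamespace(
        vertices=[SimpleNamespace(x=x, y=y) for x, y in corners])
    assert polygon_coords(box) == [0, 0, 10, 0, 10, 5, 0, 5]
```

The flat list has the same shape as the coordinate list the sample passes to ``draw.polygon``.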
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
google-cloud-vision==0.23.2
pillow==4.0.0