The Wayback Machine - https://web.archive.org/web/20200601185623/https://github.com/topics/data-engineering

data-engineering

Here are 405 public repositories matching this topic...

lauralorenz commented Apr 6, 2020

Use Case

Please provide a use case to help us understand your request in context
The Kubernetes Job tasks in our task library mimic the Kubernetes API, but an expected 'normal' use case of them is composed of several steps, namely creating a namespaced job, polling for it to complete, and deleting the job at the end. Right now no task in the task library knows how to poll for job status, an
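The create, poll, delete sequence described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not the task library's API: `run_job_to_completion`, the three injected callables, and the status strings `"succeeded"` and `"failed"` are all hypothetical stand-ins for real Kubernetes client calls.

```python
import time
from typing import Callable


def run_job_to_completion(
    create_job: Callable[[], str],
    get_job_status: Callable[[str], str],
    delete_job: Callable[[str], None],
    poll_interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Create a job, poll until it reaches a terminal state, then delete it.

    The callables are hypothetical placeholders for Kubernetes API calls;
    the terminal status names are assumptions made for this sketch.
    """
    job_name = create_job()
    deadline = time.monotonic() + timeout
    try:
        while True:
            status = get_job_status(job_name)
            if status in ("succeeded", "failed"):
                return status
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"job {job_name!r} did not finish within {timeout}s"
                )
            sleep(poll_interval)
    finally:
        # Delete the job whether it succeeded, failed, or timed out.
        delete_job(job_name)
```

Passing the three steps in as callables keeps the polling logic independent of any particular client, and the `finally` clause means the job is removed even when polling times out.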

ericmjl commented Mar 12, 2020

janitor.biology could do with a to_fasta function, I think. The intent here would be to conveniently export a dataframe of sequences as a FASTA file, using one column as the FASTA header.

Strawman implementation below:

import pandas_flavor as pf
from Bio.SeqRecord import SeqRecord
from Bio.Seq import Seq
from Bio import SeqIO

@pf.register_dataframe_method
def to_fasta(d
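The strawman above is cut off on this page, so its body is not reproduced here. As a separate, minimal sketch of the same idea (one column supplies the FASTA header, another the sequence), the following uses plain dictionaries and the standard library only. The function name `write_fasta`, its parameters, and the 60-character line width are assumptions for illustration and are not part of janitor.biology or Biopython.

```python
from typing import Iterable, Mapping, TextIO


def write_fasta(
    rows: Iterable[Mapping[str, object]],
    header_column: str,
    sequence_column: str,
    handle: TextIO,
    line_width: int = 60,
) -> int:
    """Write one FASTA record per row and return the number of records.

    Hypothetical helper: each row is a mapping such as a dict, with one key
    holding the record header and another holding the sequence string.
    """
    count = 0
    for row in rows:
        header = str(row[header_column])
        sequence = str(row[sequence_column])
        # A FASTA record starts with ">" followed by the header text.
        handle.write(f">{header}\n")
        # Wrap the sequence at a fixed width, as FASTA files commonly do.
        for start in range(0, len(sequence), line_width):
            handle.write(sequence[start:start + line_width] + "\n")
        count += 1
    return count
```

With a pandas dataframe, the rows could be supplied as `df.to_dict("records")`.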
BenBirt commented Aug 13, 2019

We should add some stuff to contributors.md. Something like:

  • when opening a PR, feel free to immediately request a review, probably from @BenBirt or @lewish
  • one reviewer is fine, add two or more though if you want to get something in faster / want more eyes reviewing
  • after resolving a round of PR comments, hit the "re-request review" button
  • once the PR is approved & you have resolved any
JNKHunter commented May 25, 2017

Ubuntu 16.04, Ansible 2.3.0
As per the readme, a directory should be created at /etc/ansible/hosts. However, /etc/ansible/hosts is also the default Ansible inventory location. This means the default inventory location specified in /etc/ansible/ansible.cfg must be changed to some other location. It would be good to state this in the readme so that Ansible newcomers are not confused. Thanks!

jameslamb commented Oct 24, 2019

Currently, there are some examples in README.md with Elasticsearch queries and corresponding uptasticsearch code. That code is effectively pseudocode right now, as it references a fictional Elasticsearch cluster.

I think this could lead to a bad experience with the docs and lead people to walk away from the project and not come back.

I would love if someone changed those examples to be r
