Location - Based in SF
The Internet Archive is seeking a Web Wide Crawl Engineer. Our crawl engineering team is responsible for capturing and managing the highest quality content from the web. An ideal candidate demonstrates independence and initiative, is a problem solver, works well autonomously, and is technologically savvy. Additionally, the ideal candidate is open to being trained on best practices and standards around large-scale web harvests.
You will work with the Web Collections Manager to design the strategy and implementation of a Web Harvesting Program using open source tools and platforms. Develop harvest techniques and tools to enable archival capture and re-rendering of rich media, streaming content, and social media, as well as traditional web page content. Analyze Web collections to ensure that the harvest is representative, complete, and of high quality. Create tools and services as needed to improve the crawl through analyzing, reporting, importing data, identifying program requirements and defining technical, operational and data analysis requirements. Lead efforts to define deployment architecture and workflows. Develop tools for automated and human-directed analysis and reporting of crawl material, and monitor production systems using automated tools.
Your responsibilities include:
Education: Computer Science, Math BS/BA or equivalent work experience.
We are an equal opportunity employer. Please send your resume and cover letter to jobs at archive dot org with the subject line "Crawl Engineer". The Archive thanks all applicants for their interest, but advises that only those selected for an interview will be contacted. No phone calls please.
Open Library is seeking an experienced Python developer to join our small, experienced team. We're working towards providing a page on the web for every book ever written, and we need your help. Open Library is open, editable and freely available. We want to enhance the way data moves in and out of Open Library by building features that make it simple for people to contribute records to the library as well as to extract them. We want to connect our records to as many online resources as possible, to be the locus for information about books online.
You will be responsible for core application development (running a system called Infogami) as well as development of new website features. You will review and enhance the Open Library's current API offering, as well as looking out on to the broader web to find and develop useful API integrations back into Open Library.
Must haves:
Desirable:
We're working towards big goals at Open Library, and at its parent organization, the non-profit Internet Archive. The online presence of books is a very interesting space at the moment, ripe for an innovative outlook and wide integration with all sorts of other systems. If you enjoy breaking new ground, iterative development and huge datasets, please let us know!
About the Internet Archive: The Internet Archive is a non-profit digital library committed to preserving the world's digital cultural artifacts. Used by over 6 million people, this resource is becoming part of how the Internet works. Our job is to put the best humanity has to offer within reach of students, educators and the general public. Find out more about our organization and web archive at www.archive.org.
The Internet Archive is an equal opportunity employer. We provide medical and dental benefits. Please send your resume and cover letter to jobs at archive dot org with the subject line "Open Library Engineer". The Internet Archive thanks all applicants for their interest, but advises that only those selected for an interview will be contacted. No phone calls please.
Do you want to work for an engineering-focused organization that's dedicated to providing universal access to all knowledge?
Does working on large-scale projects (petabytes of storage, massive bandwidth, millions of media items) excite you? Are you interested in working with smart people, solving interesting problems and benefiting humanity?
The Internet Archive (archive.org) is looking for an exceptional Engineer to support and extend the main archival system as well as associated projects such as nasaimages.org and openlibrary.org.
We are seeking a talented Engineer with the following qualifications and experience:
Candidates will be asked to take a short quiz to assess their qualifications.
The Internet Archive is an equal opportunity employer. We provide medical and dental benefits. Please send your resume and cover letter to jobs at archive dot org with the subject line "Engineer-Petabox Team". The Internet Archive thanks all applicants for their interest, but advises that only those selected for an interview will be contacted. No phone calls please.
Internet Archive is the largest digital library in the world containing several million media items. We are a non-profit dedicated to gathering and preserving cultural materials and making them available to well over 1 million users every day around the world.
"Collections" at the archive refers to audio, video and text collections. Some of these items are submitted by individual users, and others come from institutions or private collectors. You might be familiar with the feature films, live music recordings, or retro ephemeral films available on our site, among many other collections. We work with interesting content every day, and we get to spend our time helping humanity.
The Collections Engineer will assist staff and collection owners with pulling more items into our collections and will help users with access issues. S/he should be familiar with running small crawls to gather information, parsing RSS feeds, handling text encoding issues, and writing scripts that use web services like Amazon's S3.
We are an equal opportunity employer. Please send your resume and cover letter to jobs at archive dot org with the subject line "A/V Collections Engineer". The Archive thanks all applicants for their interest, but advises that only those selected for an interview will be contacted. No phone calls please.
The Internet Archive is building a new, hosted Digital Archiving service to be launched in 2011 that will provide storage, maintenance and access for the digital collections of memory institutions. The Digital Archive Engineer will be a key player on a small team of people building this service from the ground up.
This new service will interact with several existing projects at the Internet Archive, requiring the ability to understand and integrate technologies written in several different programming languages. The ideal candidate must therefore be resourceful, flexible and enjoy working on a highly collaborative but loosely structured team.
You will help us define the parameters of the service, figure out how to ingest media files from partner institutions and other storage systems and build methods for patrons to access the materials. Your work will range from back end to middleware right on up to the user interface but will initially focus on design and development of the Digital Archive API.
Must Have:
This position is based in San Francisco, CA. We cannot consider telecommuters at this time.
Applicants must be able to work in the United States. We are unable to sponsor work visas at this time.
We are an equal opportunity employer. Please send your resume and cover letter to vicky at archive dot org with the subject line "Digital Archive Engineer". The Archive thanks all applicants for their interest, but advises that only those selected for an interview will be contacted. No phone calls please.
Internet Archive is a 501(c)(3) non-profit that was founded to build an Internet library. Its purposes include offering permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections that exist in digital format. Now the Internet Archive includes texts, audio, moving images, and software as well as archived web pages in our collections, and provides specialized services for adaptive reading and information access for the blind and other persons with disabilities. Internet Archive is the home of services such as OpenLibrary, the Wayback Machine and Archive-It along with several other open projects.
The Internet Archive is offering really exciting opportunities at the scanning centers. We are looking for volunteers at the Indiana, Toronto, and Princeton scanning centers.
Help us digitize library books to go online to be seen by millions of people for years to come! We need your help! We are trying to get your public library books up online and need some volunteers to help our regular non-profit staff. If you can give us some of your time, we can give you a chance to help bring digital knowledge to others both near and far! Come join us!
Position Summary: Internet Archive is a non-profit organization working with 80+ world-class universities and libraries to create the world's largest digital open-source library.
We are looking for people who are patient, conscientious and detail oriented to work on this exciting project digitizing books. Basic knowledge of computers, digital files and digital cameras helpful. Pleasant, low-stress work environment. A love of books is a plus.
We are seeking volunteers who can operate a Scribe scanning machine that takes digital photos of books from various collections and puts them online for universal access.
Gain experience in the following fields of endeavor: building an open source digital library; digital photography and digital scanning software; preservation, presentation and production of digital books; digitizing special collection books from different centuries; and understanding copyrights and public domain materials.
Commitment: Assistance is needed Monday through Friday from 8:00am to 5:00pm. The position involves a commitment of at least 3 hours, at least one day a week, for a minimum of one month. For those interested in bolstering their credentials, we offer a four-hour, one-day-a-week, six-week internship. Applicable fields of endeavor include digital photography, digital media, the non-profit sector, library science, and computer science. We will train you on Scribe 2 software.
Interrelations:
The volunteer Scanner will interact with the Coordinator, the professional scanning staff and other volunteers.
Physical/Special Requirements:
THIS IS A NON-PAYING VOLUNTEER POSITION
If interested please send a resume and cover letter expressing your interest to the following:
The Internet Archive is a non-profit organization seeking to provide universal access to all knowledge. We are working with world-class universities and libraries to create the world's largest digital public library. The collections in our online digital library also include audio, video, web sites, and software.
Internet Archive works together with organizations like Creative Commons and EFF to preserve and expand the public domain, the open-source movement, and the commons in general. You will be a part of a multi-national effort that is presently in 5 countries and involves over 4,500 libraries and institutions.
Please read more about Internet Archive here: http://www.archive.org/about/about.php http://openlibrary.org/about