Jobs at mapegy (Technology Intelligence from Berlin)

Contact: Dr.-Ing. Peter Walde

Named Entity Matching for companies and persons

Entity matching (also referred to as duplicate identification, record linkage, entity
resolution or reference reconciliation) is a crucial task for data integration and data
cleaning in the process of information refinement. Its goal is to identify records that
refer to the same real-world entity.
The entities considered here are companies and persons in the patent database of the
startup Mapegy UG. The tasks of the thesis are:

  1. surveying the state of the art of named entity matching,
  2. developing a technical framework for well-performing named entity matching of
    patent assignees and inventors, respectively, and
  3. evaluating it empirically in a real business case.

For a quick overview refer to:

  1. Köpcke, Rahm (2009): “Frameworks for entity matching: A comparison”, Data Knowl.
    Eng.
  2. Moreau, Yvon, Cappé (2008): “Robust Similarity Measures for Named Entities
    Matching”, Proceedings of the 22nd International Conference on Computational
    Linguistics (Coling 2008), pages 593–600, Manchester.
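
To give a feel for the problem, the following is a minimal sketch of string-based matching for company names, not Mapegy's method. It assumes a small, hypothetical list of legal-form tokens (`LEGAL_FORMS`), an arbitrary threshold of 0.85, and Python's standard `difflib` similarity; the function names are illustrative only.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, incomplete list of legal-form tokens to ignore when comparing names.
LEGAL_FORMS = {"ag", "gmbh", "ug", "inc", "corp", "ltd", "co", "kg", "llc"}


def normalize(name: str) -> str:
    """Lowercase the name, delete punctuation, and drop legal-form tokens."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(t for t in cleaned.split() if t not in LEGAL_FORMS)


def similarity(a: str, b: str) -> float:
    """Character-based similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two names as the same entity if similarity reaches the threshold.

    The threshold is an assumption; in practice it would be tuned on labeled data.
    """
    return similarity(a, b) >= threshold


if __name__ == "__main__":
    print(is_match("Robert Bosch GmbH", "Robert Bosch G.m.b.H."))
    print(is_match("Siemens AG", "Samsung Electronics Co., Ltd."))
```

A realistic framework would add blocking to avoid comparing all pairs, use further attributes such as addresses or co-inventors, and measure precision and recall against a labeled sample.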

Quality Assurance in model-driven process development

Quality Assurance (QA) is an important, omnipresent and continuous part of software
development. Mapegy develops complex algorithms and processes to extract the
crucial facts from vast amounts of text data. To obtain reliable information, QA
becomes more important by the day.
The tasks of the thesis are:

  1. surveying the state of the art of quality assurance,
  2. developing a technical prototype of an efficient QA process, or
  3. a qualified specification sheet for it,
  4. evaluating it empirically in a real business case.

For a quick overview refer to:

  1. www.rapid-i.com
  2. Petrasch, Meimberg (2006): “Model Driven Architecture”
  3. Fowler (2004): “UML konzentriert”
  4. Ludewig, Lichter (2007): “Software Engineering”

Optimization of distributed model-driven business intelligence processes

In a world of increasing complexity, rapid change and data deluge, it is crucial to make
business decisions quickly and efficiently based on facts and figures. To find the
crucial facts in today's digital world, huge data sets have to be scanned and processed.
The tasks of the thesis are:

  1. surveying the state of the art of model-driven architectures and distributed processes,
  2. (re)developing and distributing business intelligence processes,
  3. managing QA for the processes developed,
  4. evaluating them empirically in a real business case.

For a quick overview refer to:

  1. Sharifat, Reif, Kofler, Breuel (2010): “Pattern Recognition Engineering”, Proceedings of
    RCOMM 2010
  2. Petrasch, Meimberg (2006): “Model Driven Architecture”
  3. Ludewig, Lichter (2007): “Software Engineering”
  4. Agarwal, Rudolph, Abecker (2008): “Semantic Description of Distributed Business
    Processes”, Proceedings of the Fourth IEEE International Conference on Semantic
    Computing.