Tamás Gábor Csapó


E-mail: csapot[at]tmit.bme.hu

Research areas: articulation, text-to-speech, signal processing, human-machine interfaces.

 

Education

2014: PhD-degree: 2013, Thesis title: “Increasing the naturalness of synthesized speech in hidden Markov-model based text-to-speech synthesis” (in Hungarian)
Jan 2014–Jul 2014: Fulbright scholarship, Department of Speech and Hearing Sciences, Indiana University, Bloomington, IN, USA, Research topic: “Investigating tongue movement during speech with ultrasound”.

2008–2013: PhD studies: program of Informatics at Budapest University of Technology and Economics (BME), Hungary. Fully funded by state scholarship. Summa cum laude, 100%.
2008: MSc-degree: 2008, Thesis title: “Implementation of prosodic variability in Text-To-Speech systems” (in Hungarian).

2003–2008: MSc Engineer in Technical Informatics, BME, Hungary, Major: Next Generation Networks, Cumulated Grade Average: 4.29/5.00
1999–2003: Kossuth Lajos High School (higher level maths class)

 

Academic Positions

2016–: Part-time research fellow, MTA

Nov 2014–: Assistant research fellow, Speech Technology and Smart Interactions Laboratory (SmartLab), Department of Telecommunications and Media Informatics (TMIT), Budapest University of Technology and Economics (BME), Hungary.

Jan 2014–Jul 2014: Visiting student researcher, Speech Production Laboratory, Department of Speech and Hearing Sciences, Indiana University, Bloomington, IN, USA.
2011–2014: PhD candidate, BME TMIT SmartLab, Hungary.
2008–2011: PhD student, BME TMIT SmartLab, Hungary.

 

Teaching Experience

2016–: Smart City laboratory, (in Hungarian), compiled and supervised a new lab on augmented reality applications. BME.
2015–: Software laboratory – databases, (in Hungarian), taught lectures and graded midterm work. BME.
2014–: Infocommunication, (in English), developed course material for English, taught lectures. BME.
2010–: Human-Computer Interaction, (in English and Hungarian), developed course material, assisted and taught lectures, graded midterm project work.
2010–: Project Laboratory and thesis writing, (in Hungarian), supervised BSc and MSc students. BME and IU.
2008–2015: Measurement Laboratory, (in English and Hungarian), compiled and supervised a new lab on VoiceXML dialog planning and taught Speech Coding Lab. BME.

 

Supervising

Successfully defended:

2015: BSc-thesis: Zoltán Umlauf, Message handling on Android extended with speech technology, (Beszédtechnológiával kiegészített üzenetkezelő rendszer Androidon), BME TMIT.

2015: BSc-thesis: Dávid Csopor, Ultrasound-based tongue contour tracking with Deep Neural Networks, (Mély neuronhálók alkalmazása ultrahangos nyelvkontúr követésre), BME TMIT.

2012: BSc-thesis: Balázs Bárány, Speech driven remote control for SmartTV, (Beszéd alapú távvezérlő OkosTV-hez), BME TMIT.

2012: BSc-thesis: Barnabás Péter Weller, Design and application of acoustic icons, (Akusztikus ikonok tervezése és alkalmazása), BME TMIT.

2011: BSc-thesis: Roland Porció, Speech synthesis with spontaneous characteristics, (Spontán jellegű beszéd mesterséges előállítása), BME TMIT.

 

Awards, Scholarships, Grants

Jul 2016: NVidia Hardware Grant (Titan X GPU)

Mar 2016: Award, Ministry of National Development, Professional Medal for Information Society.
Apr 2015: Award: 1st prize, Huszty Dénes Foundation, PhD Thesis Contest.
Jan 2014–Jul 2014: Fulbright scholarship, Indiana University, Bloomington, IN, United States of America.
Jan 2014: Grant: Hungarian Academy of Engineering, travel grant to Bloomington.
Oct 2013: Award: 3rd prize, BMe research grant, dissemination contest.
Jul 2013: Grant: Campus Hungary, travel grant to 8th Speech Synthesis Workshop.
May 2013: Award: 1st prize, Microsoft No Time to W8 contest, for the “Weather for All” application (together with colleagues from BME-TMIT).
Apr 2010: Grant: Acoustical Society of America, International Student Grant.
Sep 2009: Grant: Bizáky Puky Péter Foundation, travel grant to Interspeech 2009.
May 2009: Award: 3rd prize, Audio Engineering Society (Hungary), MSc Thesis Contest.
Nov 2007: Award: 2nd prize, Scientific Students’ Associations Annual Conference of BME.
Nov 2007: Award: 1st prize, Scientific Students’ Assoc. Ann. Conf. of BME (in German).
Sep 2007–Aug 2008: Scholarship of the Hungarian Republic, awarded by the Ministry of Education of the Hungarian Republic.
Sep 2007–Jan 2008: Scholarship of the Faculty, Faculty of Electrical Engineering and Informatics, BME.
Sep 2007–Jan 2008: Scholarship of the University, Budapest University of Technology and Economics.
Aug 2007: Grant: International Speech Communication Association, travel grant to Interspeech 2007.
Apr 2007: Award: 1st prize, National Conference of Scientific Students’ Associations.
Nov 2006: Award: 1st prize, Scientific Students’ Association Annual Conference of BME.

 

Research Projects, Grants

2016–: MTA-ELTE Lendület Lingual Articulation Research Group – funded by the Momentum program of Hungarian Academy of Sciences, leader: Alexandra Markó

Jan 2014–Jul 2014: Fulbright scholarship, Indiana University, Bloomington, IN, United States of America. Supervisor: Dr. Steven M. Lulich.

 

Professional Activities:

Review services: IEEE Signal Processing Letters, Journal on Multimodal User Interfaces, Intelligent Decision Technologies, International Journal of Speech Technology, IETE Technical Review, SPECOM (2016), RADIOELEKTRONIKA (2016), Interspeech (2013), CogInfoCom (2013)

 

Professional association memberships:

International Speech Communication Association
IEEE Signal Processing Society
Scientific Association for Infocommunications Hungary

 


Publications: (MTMT) Altogether 43 publications, of which

  • 15 journal papers (11 international, 4 national)
  • 18 conference papers (13 international, 5 national)
  • 2 book chapters (2 national)
  • 8 others

  1. Markó Alexandra, Csapó Tamás Gábor, Takács Karolina, Listeners' evaluation of voice quality in Hungarian speakers, In: BESZÉDKUTATÁS, vol. 2017, 2017, pp. 55-66 DOI
  2. Kele Xu, Pierre Roussel, Tamás Gábor Csapó, Bruce Denby, Convolutional neural network-based automatic classification of midsagittal tongue gestural targets using B-mode ultrasound images, In: JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, vol. 141, no. 6, 2017, pp. EL531-EL537 DOI article
  3. Tamás Gábor Csapó, Géza Németh, Milos Cernak, Philip N Garner, Modeling Unvoiced Sounds In Statistical Parametric Speech Synthesis with a Continuous Vocoder, In: 24th European Signal Processing Conference, EUSIPCO 2016, Budapest, Hungary, 2016, pp. 1338-1342 DOI Scopus pdf
  4. Milan Sečujski, Branislav Gerazov, Tamás Gábor Csapó, Vlado Delić, Philip N Garner, Aleksandar Gjoreski, David Guennec, Zoran Ivanovski, Aleksandar Melov, Géza Németh, Ana Stojković, György Szaszák, Design of a Speech Corpus for Research on Cross-Lingual Prosody Transfer, In: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, Budapest, Hungary, vol. 9811, 2016, pp. 199-206 WoS DOI Scopus
  5. Kele Xu, Tamás Gábor Csapó, Pierre Roussel, Bruce Denby, A comparative study on the contour tracking algorithms in ultrasound tongue images with automatic re-initialization, In: JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, vol. 139, no. 5, 2016, pp. EL154-EL160 WoS DOI Scopus
  6. Bálint Pál Tóth, Tamás Gábor Csapó, Continuous Fundamental Frequency Prediction with Deep Neural Networks, In: European Signal Processing Conference (EUSIPCO 2016), Budapest, Hungary, 2016, pp. 1348-1352 DOI Scopus pdf
  7. Tamás Gábor Csapó, Géza Németh, Milos Cernak, Residual-based excitation with continuous F0 modeling in HMM-based speech synthesis, In: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, Budapest, Hungary, vol. 9449, 2015, pp. 27-38 DOI Scopus pdf
  8. Tamás Gábor Csapó, Steven M Lulich, Error analysis of extracted tongue contours from 2D ultrasound images, In: Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Dresden, Germany, 2015, pp. 2157-2161 Scopus pdf
  9. Tamás Gábor Csapó, Géza Németh, Automatic transformation of irregular to regular voice by residual analysis and synthesis, In: Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Dresden, Germany, 2015, pp. 613-617 Scopus pdf
  10. Kálmán Abari, Tamás Gábor Csapó, Bálint Pál Tóth, Gábor Olaszy, From text to formants - indirect model for trajectory prediction based on a multi-speaker parallel speech database, In: Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Dresden, Germany, 2015, pp. 623-627 Scopus pdf
  11. Tamás Gábor Csapó, Géza Németh, Statistical parametric speech synthesis with a novel codebook-based excitation model, In: INTELLIGENT DECISION TECHNOLOGIES, vol. 8, no. 4, 2014, pp. 289-299 Scopus
  12. Tamás Gábor Csapó, Géza Németh, Modeling irregular voice in statistical parametric speech synthesis with residual codebook based excitation, In: IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, vol. 8, no. 2, 2014, pp. 209-220 WoS DOI Scopus
  13. Gy Szaszák, T G Csapó, P N Garner, B Gerazov, Z Ivanovski, G Németh, B Tóth, M Sečujski, V Delić, The SP2 SCOPES project on speech prosody, In: Proceedings of DOGS2014 - Digital speech and image processing, Novi Sad, Serbia, 2014, pp. 2-10
  14. António Teixeira, Annika Hämäläinen, Jairo Avelar, Nuno Almeida, Géza Németh, Tibor Fegyó, Csaba Zainkó, Tamás Csapó, Bálint Tóth, André Oliveira, Miguel Sales Dias, Speech-centric Multimodal Interaction for Easy-to-access Online Services – A Personal Life Assistant for the Elderly, In: PROCEDIA COMPUTER SCIENCE, Vigo, Spain, vol. 27, 2014, p. 8 WoS DOI Scopus
  15. Tamás Gábor Csapó, Géza Németh, A novel irregular voice model for HMM-based speech synthesis, In: ISCA 8th Speech Synthesis Workshop (SSW8), Barcelona, Spain, 2013, pp. 229-234
  16. Tamás Gábor Csapó, Géza Németh, A novel codebook-based excitation model for use in speech synthesis, In: Cognitive Infocommunications (CogInfoCom), Košice, Slovakia, 2012, pp. 661-665 WoS DOI Scopus pdf Google scholar
  17. Éva Székely, Tamás Gábor Csapó, Bálint Tóth, Péter Mihajlik, Julie Carson-Berndsen, Synthesizing Expressive Speech from Amateur Audiobook Recordings, In: IEEE Workshop on Spoken Language Technology, Miami, United States of America, 2012, pp. 297-302 WoS DOI Scopus pdf
  18. Gráczi TE, Lulich SM, Csapó TG, Beke A, Context and speaker dependency in the relation of vowel formants and subglottal resonances: Evidence from Hungarian, In: Interspeech 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, 2011, pp. 1901-1904 WoS Scopus pdf
  19. Géza Németh, Gábor Olaszy, Tamás Gábor Csapó, Spemoticons: Text-To-Speech based emotional auditory cues, In: ICAD 2011, Budapest, Hungary, 2011, pp. 1-7 pdf Google scholar
  20. Tamás Gábor Csapó, Csaba Zainkó, Géza Németh, A Study of Prosodic Variability Methods in a Corpus-Based Unit Selection Text-To-Speech System, In: INFOCOMMUNICATIONS JOURNAL, vol. LXV, no. 1, 2010, pp. 32-37 pdf
  21. Csapó TG, Bárkányi Zs, Gráczi TE, Bőhm T, Lulich SM, Relation of formants and subglottal resonances in Hungarian vowels, In: 10th annual conference of the International Speech Communication Association 2009 (INTERSPEECH 2009), United Kingdom / England, 2010, pp. 484-487 WoS Scopus pdf
  22. Csaba Zainkó, Tamás Gábor Csapó, Géza Németh, Special Speech Synthesis for Social Network Websites, In: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, vol. 6231, 2010, pp. 455-463 WoS DOI Scopus Google scholar
  23. Géza Németh, Tamás Gábor Csapó, Bálint Tóth, Improving the Quality of Unit Selection and HMM based Speech Synthesis, 2009 link
  24. Csapó TG, Gráczi TE, Bárkányi Zs, Beke A, Lulich SM, Patterns of Hungarian vowel production and perception with regard to subglottal resonances, In: PHONETICIAN, vol. 99-100, 2009, pp. 7-28 link
  25. Németh G, Fék M, Csapó T G, Increasing Prosodic Variability of Text-To-Speech Synthesizers, In: Interspeech 2007, Antwerp, Belgium, 2007, pp. 474-477 WoS Scopus Google scholar